Abstract

The inverse Omori law for foreshocks, discovered in the 1970s, states that the rate of earthquakes
prior to a mainshock increases on average as a power law $\propto 1/(t_c - t)^{p'}$ of the time to the
mainshock occurring at $t_c$. Here, we show that this law results from the direct Omori law for
aftershocks, describing the power law decay $\sim 1/(t - t_c)^p$ of seismicity after an earthquake,
provided that any earthquake can trigger its suite of aftershocks. In this picture, the seismic
activity at any time is the sum of the spontaneous tectonic loading and of the activity triggered 
by all preceding events weighted by their corresponding Omori law. The inverse Omori law then 
emerges as the expected (in a statistical sense) trajectory of seismicity, conditioned on the fact 
that it leads to the burst of seismic activity accompanying the mainshock. In particular, we 
predict and verify by numerical simulations on the Epidemic-Type Aftershock Sequence (ETAS)
model that $p'$ is always smaller than or equal to $p$ and is a function of $p$, of the b-value of the
Gutenberg-Richter law (GR) and of a parameter quantifying the number of direct aftershocks as
a function of the magnitude of the mainshock. The often documented apparent decrease of the
b-value of the GR law at the approach to the mainshock results straightforwardly from the
conditioning of the path of seismic activity culminating at the mainshock. However, we predict
that the GR law is not modified simply by a change of b-value but that a more accurate
statement is that the GR law gets an additive (or deviatoric) power law contribution with 
exponent smaller than b and with an amplitude growing as a power law of the time to the 
mainshock. In the space domain, we predict that the phenomenon of aftershock diffusion must 
have its mirror process reflected into an inward migration of foreshocks towards the mainshock. 
In this model, foreshock sequences are special aftershock sequences which are modified by the 
condition to end up in a burst of seismicity associated with the mainshock. Foreshocks are not
just statistical creatures, they are genuine forerunners of large shocks as shown by the large
prediction gains obtained using several of their qualifiers.
1. Introduction

Large shallow earthquakes are followed by an increase in seismic activity, defined as an aftershock sequence. It is also well known that large earthquakes are sometimes preceded by an unusually large activity rate, defined as a foreshock sequence. The Omori law, describing the power law decay $\sim 1/(t - t_c)^p$ of the aftershock rate with time from a mainshock that occurred at $t_c$, was proposed more than a century ago [Omori, 1894], and has since been verified by many studies [Kagan and Knopoff, 1978; Davis and Frohlich, 1991; Kisslinger and Jones, 1991; Utsu et al., 1995]. See however [Kisslinger, 1993; Gross and Kisslinger, 1994] for alternative decay laws such as the stretched exponential, and [Helmstetter and Sornette, 2002a] for a possible explanation.

Whereas the Omori law describing the aftershock 
decay rate is one of the few well-established empirical 
laws in seismology, the increase of foreshock rate be- 
fore an earthquake does not follow such a well-defined 
empirical law. There are huge fluctuations of the fore- 
shock seismicity rate, if any, from one sequence of 
earthquakes to another one preceding a mainshock. 
Moreover, the number of foreshocks per mainshock is
usually much smaller than the number of aftershocks.
It is thus essentially impossible to establish a deter- 
ministic empirical law that describes the intermittent 
increase of seismic activity prior to a mainshock when 
looking at a single foreshock sequence which contains 
at best a few events. Although well-developed in- 
dividual foreshock sequences are rare and mostly ir- 
regular, a well-defined acceleration of foreshock rate 
prior to a mainshock emerges when using a super- 
posed epoch analysis, in other words, by synchroniz- 
ing several foreshock sequences to a common origin of 
time defined as the time of their mainshocks and by 
stacking these synchronized foreshock sequences. In 
this case, the acceleration of the seismicity preceding
the mainshock clearly follows an inverse Omori law
of the form $N(t) \sim 1/(t_c - t)^{p'}$, where $t_c$ is the time
of the mainshock. This law was first proposed
by Papazachos [1973], and was established more
firmly by Kagan and Knopoff [1978] and Jones and Molnar [1979].
The inverse Omori law is usually observed
over time scales shorter than those of the direct Omori law, of
the order of weeks to months before the mainshock.
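The superposed epoch analysis described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the catalog is reduced to a list of event times, the mainshock times are taken as given, and the function names `stack_foreshocks` and `log_binned_rate` are hypothetical helpers, not part of any published code.

```python
import math
from bisect import bisect_left


def stack_foreshocks(event_times, mainshock_times, window):
    """Pool the times-to-mainshock (t_c - t) of all events that fall in
    the interval [t_c - window, t_c) before each mainshock time t_c."""
    times = sorted(event_times)
    lags = []
    for tc in mainshock_times:
        lo = bisect_left(times, tc - window)  # first event inside the window
        hi = bisect_left(times, tc)           # events strictly before t_c
        lags.extend(tc - t for t in times[lo:hi])
    return lags


def log_binned_rate(lags, n_mainshocks, t_min, t_max, n_bins):
    """Average foreshock rate per mainshock in logarithmic bins of t_c - t.
    Returns a list of (bin centre, rate) pairs; a power law appears as a
    straight line of slope -p' in a log-log plot of these pairs."""
    ratio = (t_max / t_min) ** (1.0 / n_bins)
    edges = [t_min * ratio ** k for k in range(n_bins + 1)]
    counts = [0] * n_bins
    for lag in lags:
        if t_min <= lag < t_max:
            k = min(int(math.log(lag / t_min) / math.log(ratio)), n_bins - 1)
            counts[k] += 1
    return [(math.sqrt(edges[k] * edges[k + 1]),
             counts[k] / (n_mainshocks * (edges[k + 1] - edges[k])))
            for k in range(n_bins)]
```

Synchronizing every sequence to its own $t_c$ before pooling is what allows the average acceleration to emerge even though each individual sequence contains only a few events.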

A clear identification of foreshocks, aftershocks and mainshocks is hindered by the difficulties in associating an unambiguous and unique space-time-magnitude domain to any earthquake sequence. Identifying aftershocks and foreshocks requires the definition of a space-time window. All events in the same space-time domain define a sequence. The largest earthquake in the sequence is called the mainshock. The following events are identified as aftershocks, and the preceding events are called foreshocks.

Large aftershocks show the existence of secondary 
aftershock activities, that is, the fact that aftershocks 
may have their own aftershocks, such as the M = 6.5
Big Bear event, which is considered an aftershock
of the M = 7.2 Landers, California, earthquake, and
which clearly triggered its own aftershocks. Of course,
the aftershocks of aftershocks can be clearly identi- 
fied without further insight and analysis as obvious 
bursts of transient seismic activity above the back- 
ground seismicity level, only for the largest after- 
shocks. But because aftershocks exist on all scales, 
from the laboratory scale, e.g. [Mogi, 1967; Scholz, 
1968], to the worldwide seismicity, we may expect 
that all earthquakes, whatever their magnitude, trig- 
ger their own aftershocks, but with a rate increasing 
with the mainshock magnitude, so that only after- 
shocks of the largest earthquakes are identifiable un- 
ambiguously. 

The properties of aftershock and foreshock sequences depend on the choice of these space-time windows, and on the specific definition of foreshocks [e.g. Ogata et al., 1996], which can sometimes be rather arbitrary. In the sequel, we shall consider two definitions of foreshocks for a given space and time window:

1. we shall call "foreshock" of type I any event of
magnitude smaller than or equal to the magnitude
of the following event, then identified as a
"mainshock". This definition implies the choice
of a space-time window $R \times T$ used to define
both foreshocks and mainshocks. Mainshocks
are large earthquakes that were not preceded by
a larger event in this space-time window. The
same window is used to select foreshocks before
mainshocks;

2. we shall also consider "foreshock" of type II, as 
any earthquake preceding a large earthquake, 
defined as the mainshock, independently of the 
relative magnitude of the foreshock compared to 
that of the mainshock. This second definition 
will thus incorporate seismic sequences in which 
a foreshock could have a magnitude larger than 
the mainshock, a situation which can alterna- 
tively be interpreted as a mainshock followed 
by a large aftershock. 
The advantage of this second definition is that foreshocks of type II are automatically defined as soon as one has identified the mainshocks, for instance, by calling mainshocks all events of magnitudes larger than some threshold of interest. Foreshocks of type II are thus all events preceding these large magnitude mainshocks. In contrast, foreshocks of type I need to obey a constraint on their magnitude, which may be artificial, as suggested by the previous discussion. All studies published in the literature deal with foreshocks of type I. Using a very simple model of seismicity, the so-called ETAS (epidemic-type aftershock) model, we shall show that the definition of foreshocks of type II is also quite meaningful and provides new insights for classifying earthquake phenomenology and understanding earthquake clustering in time and space.
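The two definitions can be made concrete with a small sketch. It assumes a time-only catalog of `(time, magnitude)` pairs sorted by time, so the spatial part of the window is ignored; the names `select_mainshocks` and `foreshocks` and the threshold `m_min` are illustrative choices, not taken from the original study.

```python
def select_mainshocks(catalog, m_min, window, type_i):
    """Select mainshocks of magnitude >= m_min from a time-sorted catalog
    of (time, magnitude) pairs.

    type_i=True : keep only events not preceded by a larger event within
                  `window` (mainshocks of the type I definition).
    type_i=False: keep every event above the threshold (type II).
    """
    mains = []
    for j, (tc, mc) in enumerate(catalog):
        if mc < m_min:
            continue
        if type_i:
            preceded_by_larger = any(
                tc - window <= t < tc and m > mc for t, m in catalog[:j])
            if preceded_by_larger:
                continue
        mains.append((tc, mc))
    return mains


def foreshocks(catalog, mainshock, window):
    """All events in [t_c - window, t_c) before the given mainshock.
    For a type I mainshock these are automatically no larger than it."""
    tc, _ = mainshock
    return [(t, m) for t, m in catalog if tc - window <= t < tc]
```

With the type II choice, a "foreshock" may be larger than its "mainshock", which is exactly the situation the second definition is meant to include.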

The exponent $p'$ of the inverse Omori law is usually found to be smaller than or close to 1 [Papazachos et al., 1967; Papazachos et al., 1975b; Kagan and Knopoff, 1978; Jones and Molnar, 1979; Davis and Frohlich, 1991; Shaw, 1993; Ogata et al., 1995; Maeda, 1999; Reasenberg, 1999], and is always found smaller than or equal to the direct Omori exponent $p$ when the two exponents $p$ and $p'$ are measured simultaneously on the same mainshocks [Kagan and Knopoff, 1978; Davis and Frohlich, 1991; Shaw, 1993; Maeda, 1999; Reasenberg, 1999]. Shaw [1993] suggested in a particular case the relationship $p' = 2p - 1$, based on a clever but slightly incorrect reasoning (see below). We shall recover this relationship below only in a certain regime of the ETAS model, from an exact treatment of the foreshocks of type II within the framework of the ETAS model.

Other studies tried to fit a power law increase of seismicity to individual foreshock sequences. Rather than the number of foreshocks, these studies usually fit the cumulative Benioff strain release $\epsilon$ by a power law $\epsilon(t) = \epsilon_c - B (t_c - t)^z$ with an exponent $z$ that is often found close to 0.3 (see [Jaume and Sykes, 1999; Sammis and Sornette, 2002] for reviews). Assuming a constant Gutenberg-Richter b-value through time, so that the acceleration of the cumulative Benioff strain before the mainshock is due only to the increase in the seismicity rate, this would argue for a $p'$-value close to 0.7. These studies were often motivated by the critical point theory [Sornette and Sammis, 1995], which predicts a power-law increase of seismic activity before major earthquakes (see e.g. [Sammis and Sornette, 2002] for a review). However, the statistical significance of such a power-law acceleration of energy before individual mainshocks is still controversial [Zöller and Hainzl, 2002].
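The step from $z \approx 0.3$ to $p' \approx 0.7$ can be written out explicitly. The following is a short worked relation under the assumption stated above (a b-value constant in time, so that the mean Benioff strain released per event, written here as a hypothetical constant $\bar{\epsilon}_1$, does not change as the mainshock is approached), and for $p' < 1$:

```latex
\frac{d\epsilon}{dt} = \bar{\epsilon}_1 \, N(t)
   \propto \frac{1}{(t_c - t)^{p'}}
\quad\Longrightarrow\quad
\epsilon_c - \epsilon(t)
   = \int_t^{t_c} \frac{d\epsilon}{dt'} \, dt'
   \propto \frac{(t_c - t)^{1 - p'}}{1 - p'} ,
```

```latex
\text{so that} \quad z = 1 - p' , \qquad
z \approx 0.3 \;\Longrightarrow\; p' \approx 0.7 .
```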

The frequency-size distribution of foreshocks has also been observed either to be different from that of aftershocks, $b' < b$, e.g. [Suyehiro, 1966; Papazachos et al., 1967; Ikegami, 1967; Berg, 1968], or to change as the mainshock is approached. This change of magnitude distribution is often interpreted as a decrease of b-value, first reported by [Kagan and Knopoff, 1978; Li et al., 1978; Wu et al., 1978]. Other studies suggest that the modification of the magnitude distribution is due only to moderate or large events, whereas the distribution of small magnitude events is not modified [Rotwain et al., 1997; Jaume and Sykes, 1999]. Knopoff et al. [1982] state that only in the rare cases of catalogs of great length are statistically significant smaller b-values found for foreshocks than for aftershocks. Nevertheless they believe the effect is likely to be real in most catalogs, but at a very low level of difference.

On the theoretical front, there have been several 
models developed to account for foreshocks. Because 
foreshocks are rare and seem to be the forerunners of large
events, a natural approach is to search for physical 
mechanisms that may explain their specificity. And, 
if there is a specificity, this might lead to the use 
of foreshocks as precursory patterns for earthquake 
prediction. Foreshocks may result from a slow subcritical
weakening by stress corrosion [Yamashita and
Knopoff, 1989, 1992; Shaw, 1993] or from a general
damage process [Sornette et al, 1992]. The same 
mechanism can also reproduce aftershock behavior 
[Yamashita and Knopoff, 1987; Shaw, 1993]. Fore- 
shocks and aftershocks may result also from the dy- 
namics of stress distribution on pre-existing hierar- 
chical structures of faults or tectonic blocks [Huang 
et al., 1998; Gabrielov et al., 2000a, b; Narteau et al.,
2000], when assuming that the scale over which stress
redistribution occurs is controlled by the level of the 
hierarchy (cell size in a hierarchical cellular automa- 
ton model). Dodge et al. [1996] argue that foreshocks 
are a byproduct of an aseismic nucleation process of
a mainshock. Other possible mechanisms for both af- 
tershocks and foreshocks are based on the visco-elastic 
response of the crust and on delayed transfer of flu- 
ids in and out of fault structures [Hainzl et al, 1999; 
Pelletier, 2000]. 

Therefore, most of these models suggest a link between aftershocks and foreshocks. In the present work, we explore this link further by asking the following question: is it possible to derive most if not all of the observed phenomenology of foreshocks from the knowledge of only the most basic and robust facts of earthquake phenomenology, namely the Gutenberg-Richter and Omori laws? To address this question, we use what is perhaps the simplest statistical model of seismicity, the so-called ETAS (epidemic-type aftershock) model, based only on the Gutenberg-Richter and Omori laws. This model assumes that each earthquake can trigger aftershocks, with a rate increasing as a power law $E^a$ with the mainshock energy $E$, and which decays with the time from the mainshock according to the "local" Omori law $\sim 1/(t - t_c)^{1+\theta}$, with $\theta > 0$. We stress that the exponent $1 + \theta$ is in general different from the observable $p$-value, as we shall explain below. In this model, the seismicity rate is the result of the whole cascade of direct and secondary aftershocks, that is, aftershocks of aftershocks, aftershocks of aftershocks of aftershocks, and so on.

In two previous studies of this model, we have ana- 
lyzed the super-critical regime [Helmstetter and Sor- 
nette, 2002a] and the singular regime [Sornette and 
Helmstetter, 2002] of the ETAS model and have shown 
that these regimes can produce respectively an expo- 
nential or a power law acceleration of the seismicity 
rate. These results can reproduce an individual ac- 
celerating foreshock sequence, but they cannot model 
the stationary seismicity with alternating increases
and decreases of the seismicity rate before and after a large
earthquake. In this study, we analyze the station- 
ary sub-critical regime of this branching model and 
we show that foreshock sequences are special after- 
shock sequences which are modified by the condition 
to end up in a burst of seismicity associated with the 
mainshock. Using only the physics of aftershocks, all 
the foreshock phenomenology is derived analytically 
and verified accurately by our numerical simulations. 
This is related to but fundamentally different from 
the proposal by Jones et al. [1999] that foreshocks 
are mainshocks whose aftershocks happen to be big. 

Our analytical and numerical investigation of the
ETAS model gives the following main results:

• In the ETAS model, the rate of foreshocks increases before the mainshock according to the inverse Omori law $N(t) \sim 1/(t_c - t)^{p'}$ with an exponent $p'$ smaller than the exponent $p$ of the direct Omori law. The exponent $p'$ depends on the local Omori exponent $1 + \theta$, on the exponent $\beta$ of the energy distribution, and on the exponent $a$ which describes the increase in the number of aftershocks with the mainshock energy. In contrast with the direct Omori law, which is clearly observed after all large earthquakes, the inverse Omori law is a statistical law, which is observed only when stacking many foreshock sequences.

• While the number of aftershocks increases as 
the power $E^a$ of the mainshock energy $E$, the
number of foreshocks of type II is independent 
of E. Thus, the seismicity generated by the 
ETAS model increases on average according to 
the inverse Omori law before any earthquake, 
whatever its magnitude. For foreshocks of type 
I, the same results hold for large mainshocks 
while the conditioning on foreshocks of type I 
to be smaller than their mainshock makes their 
number increase with E for small and interme- 
diate values of the mainshock size. 

• Conditioned on the fact that a foreshock se- 
quence leads to a burst of seismic activity ac- 
companying the mainshock, we find that the 
foreshock energy distribution is modified upon 
the approach of the mainshock, and develops 
a bump in its tail. This result may explain 
both the often reported decrease in measured b-value
before large earthquakes and the smaller
b-value obtained for foreshocks compared with
other earthquakes.

• In the ETAS model, the modification of the Gutenberg-Richter distribution for foreshocks is shown analytically to take the shape of an additive correction to the standard power law, in which the new term is another power law with exponent $\beta - a$. The amplitude of this additive power law term also exhibits a kind of inverse Omori law acceleration upon the approach to the mainshock, with a different exponent. These predictions are accurately substantiated by our numerical simulations.

• When looking at the spatial distribution of foreshocks in the ETAS model, we find that the foreshocks migrate towards the mainshock as time increases. This migration is driven by the same mechanism underlying the aftershock diffusion [Helmstetter and Sornette, 2002b].

Thus, the ETAS model, which is commonly used to describe aftershock activity, seems sufficient to explain the main properties of foreshock behavior in real seismicity. Our presentation is organized as follows. In the next section, we define the ETAS model, recall how the average rate of seismicity can be obtained formally from a Master equation and describe how to deal with fluctuations decorating the average rate. Section 3 provides the full derivation of the inverse Omori law, starting with an intuitive presentation followed by a more technical description. Section 4 contains the derivation of the modification of the distribution of foreshock energies. Section 5 describes the migration of foreshock activity. Section 6 is a discussion of how our analytical and numerical results allow us to rationalize previous empirical observations. In particular, we show that foreshocks are not just statistical creatures but are genuine forerunners of large shocks that can be used to obtain significant prediction gains. Section 7 concludes.

2. Definition of the ETAS model and 
its master equation for the 
renormalized Omori law 

2.1. Definitions 

The ETAS model was introduced by Kagan and 
Knopoff [1981, 1987] and Ogata [1988] to describe the 
temporal and spatial clustering of seismicity and has 
since been used by many other workers with success 
to describe real seismicity. Its value stems from the 
remarkable simplicity of its premises and the small 
number of assumptions, and from the fact that it has 
been shown to fit well to seismicity data [Ogata, 1988]. 

Contrary to the usual definition of aftershocks, the 
ETAS model does not impose an aftershock to have 
an energy smaller than the mainshock. This way,
the same underlying physical mechanism is assumed
to describe foreshocks, aftershocks and mainshocks alike.
Abandoning the concept, ingrained in
many seismologists' minds, of a distinction between
foreshocks, aftershocks and mainshocks is an important
step towards a simplification and towards an
understanding of the mechanism underlying earthquake
sequences. Ultimately, this parsimonious assumption
will be validated or falsified by the comparison of its 
prediction with empirical data. In particular, the de- 
viations from the predictions derived from this as- 
sumption will provide guidelines to enrich the physics. 

In order to avoid problems arising from divergences associated with the proliferation of small earthquakes, the ETAS model assumes the existence of a magnitude cut-off $m_0$, or equivalently an energy cut-off $E_0$, such that only earthquakes of magnitude $m \geq m_0$ are allowed to give birth to aftershocks larger than $m_0$, while events of smaller magnitudes are lost for the epidemic dynamics. We refer to [Helmstetter and Sornette, 2002a] for a justification of this hypothesis and a discussion of ways to improve this description.

The ETAS model assumes that the seismicity rate (or "bare Omori propagator") at a time between $t$ and $t + dt$, resulting in direct "lineage" (without intermediate events) from an earthquake $i$ that occurred at time $t_i$, is given by

$$\phi_{E_i}(t - t_i) = \rho(E_i)\, \Psi(t - t_i) , \qquad (1)$$

where $\Psi(t)$ is the normalized waiting time density distribution (that we shall later take to be given by (4)) and $\rho(E_i)$, defined by

$$\rho(E_i) = k\, (E_i/E_0)^a , \qquad (2)$$

gives the average number of daughters born from a mother with energy $E_i \geq E_0$. This term $\rho(E_i)$ accounts for the fact that large mothers have many more daughters than small mothers because the larger spatial extension of their rupture triggers a larger domain. Expression (2) results in a natural way from
the assumption that aftershocks are events influenced 
by stress transfer mechanisms extending over a space 
domain proportional to the size of the mainshock 
rupture [Helmstetter, 2003]. Indeed, using the well- 
established scaling law relating the size of rupture and 
the domain extension of aftershocks [Kagan, 2002] to 
the release energy (or seismic moment), and assum- 
ing a uniform spatial distribution of aftershocks in 
their domain, expression (2) immediately follows (it 
still holds if the density of aftershocks is slowly vary- 
ing or power law decaying with the distance from the 
mainshock).

The value of the exponent $a$ controls the nature of the seismic activity, that is, the relative role of small compared to large earthquakes. Few studies have measured $a$ in seismicity data [Yamanaka and Shimazaki, 1990; Guo and Ogata, 1997; Helmstetter, 2003]. This parameter $a$ is often found close to the $\beta$ exponent of the energy distribution defined below in equation (3) [e.g., Yamanaka and Shimazaki, 1990] or fixed arbitrarily equal to $\beta$ [e.g., Kagan and Knopoff, 1987; Reasenberg and Jones, 1989; Felzer et al., 2002]. For a large range of mainshock magnitudes and using a more sophisticated scaling approach, Helmstetter [2003] found $a = 0.8\beta$ for the Southern California seismicity. If $a < \beta$, small earthquakes, taken together, trigger more aftershocks than larger earthquakes. In contrast, large earthquakes dominate earthquake triggering if $a > \beta$. This case $a > \beta$ has been studied analytically in the framework of the ETAS model by Sornette and Helmstetter [2002] and has been shown to eventually lead to a finite time singularity of the seismicity rate. This explosive regime cannot however describe a stationary seismic activity. In this paper, we will therefore consider only the case $a < \beta$.

An additional space-dependence can be added to $\phi_{E_i}(t - t_i)$ [Helmstetter and Sornette, 2002b]: when integrated over all space, the predictions of the space-time model reduce to those of the pure time-dependent model. Since we are interested in the inverse Omori law for foreshocks, which is a statement describing only the time-dependence, it is sufficient to use the time-only version of the ETAS model for the theory.

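The model is complemented by the Gutenberg-Richter law which states that each aftershock $i$ has an energy $E_i \geq E_0$ chosen according to the density distribution

$$P(E) = \frac{\beta\, E_0^{\beta}}{E^{1+\beta}} , \quad \text{with } \beta \simeq 2/3 . \qquad (3)$$

$P(E)$ is normalized: $\int_{E_0}^{\infty} dE\, P(E) = 1$.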
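Drawing energies from the power law density (3) is a one-line inverse-transform step, since its survival function is $(E_0/E)^{\beta}$ for $E \geq E_0$. The sketch below is illustrative only; the function name `sample_energy` and the choice of a seeded `random.Random` generator are assumptions made for the example.

```python
import random


def sample_energy(e0, beta, rng):
    """Draw one energy E >= e0 from P(E) = beta * e0**beta / E**(1 + beta)
    by inverting the survival function P(E' > E) = (e0 / E)**beta."""
    u = 1.0 - rng.random()           # uniform in (0, 1], avoids u == 0
    return e0 * u ** (-1.0 / beta)


if __name__ == "__main__":
    rng = random.Random(0)
    energies = [sample_energy(1.0, 2.0 / 3.0, rng) for _ in range(100000)]
    # For beta = 2/3 the fraction of events with E > 8 E0 should be close
    # to 8**(-2/3) = 0.25.
    print(sum(e > 8.0 for e in energies) / len(energies))
```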

In view of the empirical observations that the observed rate of aftershocks decays as a power law of the time since the mainshock, it is natural to choose the "bare" modified Omori law (or the normalized waiting time distribution between events) $\Psi(t - t_i)$ in (1) also as a power law

$$\Psi(t - t_i) = \frac{\theta\, c^{\theta}}{(t - t_i + c)^{1+\theta}} . \qquad (4)$$

$\Psi(t - t_i)$ is the rate of daughters of the first generation born at time $t - t_i$ from the mother-mainshock. Here, $c$ provides an "ultra-violet" cut-off which ensures the finiteness of the number of aftershocks at early times. It is important to recognize that the observed aftershock decay rate may be different from $\Psi(t - t_i)$ due to the effect of aftershocks of aftershocks, and so on [Sornette and Sornette, 1999; Helmstetter and Sornette, 2002a].
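For a normalized power law waiting time density of the form $\theta c^{\theta}/(t + c)^{1+\theta}$, the cumulative distribution is $1 - (c/(t + c))^{\theta}$, which can be inverted in closed form. The following minimal sketch assumes that form for the bare Omori law; `bare_omori` and `sample_waiting_time` are hypothetical helper names.

```python
import random


def bare_omori(t, c, theta):
    """Normalized bare Omori density theta * c**theta / (t + c)**(1 + theta)."""
    return theta * c ** theta / (t + c) ** (1.0 + theta)


def sample_waiting_time(c, theta, rng):
    """Draw a mother-to-daughter delay by inverting the cumulative
    distribution F(t) = 1 - (c / (t + c))**theta."""
    u = 1.0 - rng.random()           # uniform in (0, 1]
    return c * (u ** (-1.0 / theta) - 1.0)


if __name__ == "__main__":
    rng = random.Random(1)
    c, theta = 0.001, 0.2
    delays = [sample_waiting_time(c, theta, rng) for _ in range(100000)]
    # The median of this distribution is c * (2**(1/theta) - 1) = 31 c,
    # so about half of the delays should fall below that value.
    print(sum(d <= 31.0 * c for d in delays) / len(delays))
```

The heavy tail for small $\theta$ means that a sizeable fraction of direct aftershocks occurs long after the mother event, which matters when a finite simulation window is used.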

The ETAS model is a "branching" point-process [Harris, 1963; Daley and Vere-Jones, 1988] controlled by the key parameter $n$ defined as the average number (or "branching ratio") of daughter-earthquakes created per mother-event, summed over all times and averaged over all possible energies. This branching ratio $n$ is defined as the following integral

$$n = \int_0^{\infty} dt \int_{E_0}^{\infty} dE\; P(E)\, \phi_E(t) . \qquad (5)$$

The double integral in (5) converges if $\theta > 0$ and $a < \beta$. In this case, $n$ has a finite value

$$n = \frac{k \beta}{\beta - a} , \qquad (6)$$

obtained by using the separability of the two integrals in (5). The normal regime corresponds to the subcritical case $n < 1$ for which the seismicity rate decays after a mainshock to a constant background (in the case of a steady-state source) decorated by fluctuations in the seismic rate.
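The closed form $n = k\beta/(\beta - a)$ can be checked numerically by averaging the productivity $k (E/E_0)^a$ over energies drawn from a power law with exponent $\beta$. The sketch below is an illustration with assumed parameter values ($k = 0.56$, $a = 0.2$, $\beta = 2/3$, chosen so that $a < \beta/2$ and the Monte Carlo average has finite variance); the function names are hypothetical.

```python
import random


def branching_ratio(k, a, beta):
    """Closed-form branching ratio n = k * beta / (beta - a), valid for a < beta."""
    if not a < beta:
        raise ValueError("the energy integral diverges unless a < beta")
    return k * beta / (beta - a)


def branching_ratio_mc(k, a, beta, n_samples, rng):
    """Monte Carlo estimate of n as the mean number of direct daughters
    k * (E/E0)**a, with E/E0 = u**(-1/beta) and u uniform in (0, 1]."""
    total = 0.0
    for _ in range(n_samples):
        u = 1.0 - rng.random()
        total += k * u ** (-a / beta)
    return total / n_samples


if __name__ == "__main__":
    k, a, beta = 0.56, 0.2, 2.0 / 3.0
    print(branching_ratio(k, a, beta))                        # analytical value
    print(branching_ratio_mc(k, a, beta, 200000, random.Random(2)))
```

For $a$ closer to $\beta$ the sample mean converges very slowly, because the productivity distribution becomes heavy-tailed; this is a practical reflection of the convergence condition $a < \beta$.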

The total rate of seismicity $\lambda(t)$ at time $t$ is given by

$$\lambda(t) = s(t) + \sum_{i \,|\, t_i < t} \phi_{E_i}(t - t_i) , \qquad (7)$$

where $\phi_{E_i}(t - t_i)$ is defined by (1). The sum $\sum_{i \,|\, t_i < t}$ is performed over all events that occurred at times $t_i < t$, where $E_i$ is the energy of the earthquake that occurred at $t_i$. $s(t)$ is often taken as a stationary Poisson background stemming from plate tectonics and provides a driving source to the process. The second term on the right-hand side of expression (7) is nothing but the sum of (1) over all events preceding time $t$.

Note that there are three sources of stochasticity underlying the dynamics of $\lambda(t)$: (i) the source term $s(t)$, often taken as Poissonian, (ii) the random occurrences of preceding earthquakes defining the time sequence $\{t_i\}$ and (iii) the draw of the energy of each event according to the distribution $P(E)$ given by (3). Knowing the seismic rate $\lambda(t)$ at time $t$, the time of the following event is then determined according to a non-stationary Poisson process of conditional intensity $\lambda(t)$, and its magnitude is chosen according to the Gutenberg-Richter distribution (3).
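A minimal simulation sketch of the time-only model follows. Instead of stepping through the conditional intensity event by event as described above, it uses the equivalent branching (cluster) construction: background events are drawn from a stationary Poisson process, and each event independently produces a Poisson number of direct daughters with mean $k (E/E_0)^a$, delayed by power law waiting times. The power law forms used for the energies and delays, all parameter values and the names `poisson` and `simulate_etas` are assumptions of this example, not the authors' code.

```python
import random


def poisson(lam, rng):
    """Poisson draw obtained by counting unit-rate exponential arrivals
    that fall in [0, lam); works for any lam >= 0 (cost grows with lam)."""
    count, s = 0, rng.expovariate(1.0)
    while s < lam:
        count += 1
        s += rng.expovariate(1.0)
    return count


def simulate_etas(mu, t_end, k, a, beta, c, theta, e0, rng):
    """Return a time-sorted list of (time, energy, generation) on [0, t_end)."""
    def energy():
        # power law energies above e0 with exponent beta
        return e0 * (1.0 - rng.random()) ** (-1.0 / beta)

    def delay():
        # power law waiting time with cut-off c and exponent 1 + theta
        return c * ((1.0 - rng.random()) ** (-1.0 / theta) - 1.0)

    # Generation 0: stationary Poisson background of rate mu.
    events, t = [], rng.expovariate(mu)
    while t < t_end:
        events.append((t, energy(), 0))
        t += rng.expovariate(mu)

    # Cascade: every event triggers its own direct daughters.
    queue = list(events)
    while queue:
        t_mother, e_mother, gen = queue.pop()
        n_daughters = poisson(k * (e_mother / e0) ** a, rng)
        for _ in range(n_daughters):
            t_child = t_mother + delay()
            if t_child < t_end:          # daughters beyond the window are dropped
                child = (t_child, energy(), gen + 1)
                events.append(child)
                queue.append(child)
    events.sort()
    return events


if __name__ == "__main__":
    # k = 0.35, a = 0.2, beta = 2/3 gives a branching ratio n = 0.5 (sub-critical).
    catalog = simulate_etas(1.0, 500.0, 0.35, 0.2, 2.0 / 3.0,
                            0.001, 0.2, 1.0, random.Random(3))
    print(len(catalog), sum(1 for e in catalog if e[2] == 0))
```

The cascade terminates with probability one only in the sub-critical regime $n < 1$; no aftershock is required to be smaller than its mother, in line with the model definition.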

2.2. The Master equation for the average 
seismicity rate 

It is useful to rewrite expression (7) formally as

$$\lambda(t) = s(t) + \int_{-\infty}^{t} d\tau \int_{E_0}^{+\infty} dE\; \phi_E(t - \tau) \sum_{i \,|\, t_i < t} \delta(E - E_i)\, \delta(\tau - t_i) , \qquad (8)$$

where $\delta(u)$ is the Dirac distribution. Taking the expectation of (8) over all possible statistical scenarios (so-called ensemble average), and assuming the separability in time and magnitude, we obtain the following Master equation for the first moment or statistical average $N(t)$ of $\lambda(t)$ [Helmstetter and Sornette, 2002a]



$$N(t) = \mu + \int_{-\infty}^{t} d\tau\; \phi(t - \tau)\, N(\tau) , \qquad (9)$$

where $\mu$ is the expectation of the source term $s(t)$ and

$$\phi(t) = \int_{E_0}^{\infty} dE'\; P(E')\, \phi_{E'}(t) . \qquad (10)$$

By virtue of (6), $\int_0^{\infty} \phi(t)\, dt = n$. We have used the definitions

$$N(t) = \langle \lambda(t) \rangle = \Big\langle \sum_{t_i < t} \delta(t - t_i) \Big\rangle , \qquad (11)$$

and

$$P(E) = \langle \delta(E - E_i) \rangle , \qquad (12)$$

where the brackets $\langle . \rangle$ denote the ensemble average. The average is performed over different statistical responses to the same source term $s(t)$, where $s(t)$ can be arbitrary. $N(t)\,dt$ is the average number of events occurring between $t$ and $t + dt$ of any possible energy.

The essential approximation used to derive (9) is that

$$\langle \rho(E_i)\, \delta(E - E_i)\, \delta(\tau - t_i) \rangle = \langle \rho(E_i)\, \delta(E - E_i) \rangle\, \langle \delta(\tau - t_i) \rangle \qquad (13)$$

in (8). In words, the fluctuations of the earthquake energies can be considered to be decoupled from those of the seismic rate. This approximation is valid for $a < \beta/2$, for which the random variable $\rho(E_i)$ has a finite variance. In this case, any coupling between the fluctuations of the earthquake energies and the instantaneous seismic rate provides only sub-dominant corrections to equation (9). For $a > \beta/2$, the variance of $\rho(E_i)$ is mathematically infinite or undefined, as $\rho(E_i)$ is distributed according to a power law with exponent $\beta/a < 2$ (see chapter 4.4 of [Sornette, 2000]). In this case, the Master equation (9) is not completely correct, as an additional term must be included to account for the dominating effect of the dependence between the fluctuations of earthquake energies and the instantaneous seismic rate.



Equation (9) is a linear self-consistent integral equation. In the presence of a stationary source of average level $\mu$, the average seismicity in the sub-critical regime is therefore

$$\langle N \rangle = \frac{\mu}{1 - n} . \qquad (14)$$

This result (14) shows that the effect of the cascade of aftershocks of aftershocks and so on is to renormalize the average background seismicity $\langle s \rangle$ to a significantly higher level, the closer $n$ is to the critical value 1.
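The stationary level follows in one step from the Master equation. As a brief worked derivation, inserting a time-independent rate $N(t) = \langle N \rangle$ into (9) and using $\int_0^{\infty} \phi(t)\,dt = n$ gives:

```latex
\langle N \rangle
  = \mu + \langle N \rangle \int_{-\infty}^{t} d\tau \, \phi(t - \tau)
  = \mu + n \, \langle N \rangle
\quad\Longrightarrow\quad
\langle N \rangle = \frac{\mu}{1 - n}
  = \mu \left( 1 + n + n^2 + \dots \right) , \qquad n < 1 .
```

The geometric series makes the interpretation explicit: each background event produces on average $n$ direct aftershocks, $n^2$ second-generation aftershocks, and so on. For example, $n = 0.9$ corresponds to an average rate ten times the background rate.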

In order to solve for $N(t)$ in the general case, it is convenient to introduce the Green function or "dressed propagator" $K(t)$ defined as the solution of (9) for the case where the source term is a delta function centered at the origin of time corresponding to a single mainshock:

$$K(t) = \delta(t) + \int_0^{t} d\tau\; \phi(t - \tau)\, K(\tau) . \qquad (15)$$

Physically, $K(t)$ is nothing but the "renormalized" Omori law quantifying the fact that the event at time 0 started a sequence of aftershocks which can themselves trigger secondary aftershocks and so on. The cumulative effect of all the possible branching paths of activity gives rise to the net seismic activity $K(t)$ triggered by the initial event at $t = 0$. Thus, the decay rate of aftershocks following a mainshock recorded in a given earthquake catalog is described by $K(t)$, while $\Psi(t)$ defined by (4) is a priori unobservable (see however [Helmstetter and Sornette, 2002a]).

This remark is important because it turns out that the renormalized Omori law $K(t)$ may be very different from the bare Omori law $\Psi(t - t_i)$, because of the effect of the cascade of secondary, tertiary, and higher-order events triggered by any single event. The behavior of the average renormalized Omori law $K(t)$ has been fully classified in [Helmstetter and Sornette, 2002a] (see also [Sornette and Sornette, 1999]): with a single value of the exponent $1 + \theta$ of the "bare" propagator $\Psi(t) \sim 1/t^{1+\theta}$ defined in (4), one obtains a continuum of apparent exponents for the global rate of aftershocks. This result may account for the observed variability of the Omori exponent $p$ in the range 0.5 to 1.5 or beyond, as reported by many workers [Utsu et al., 1995]. Indeed, the general solution of (15) in the sub-critical regime $n < 1$ is

$$K(t) \sim 1/t^{1-\theta} , \quad \text{for } c < t < t^* ,$$
$$K(t) \sim 1/t^{1+\theta} , \quad \text{for } t > t^* , \qquad (16)$$
where

$$t^* \approx c\,(1 - n)^{-1/\theta} . \qquad (17)$$

Thus, in practice, the apparent $p$ exponent can be found anywhere between $1 - \theta$ and $1 + \theta$. This behavior (16) is valid for $a < \beta/2$ for which, as we explained already, the fluctuations of the earthquake energies can be considered to be decoupled from those of the seismic rate.
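To get a feeling for the crossover time $t^* \approx c\,(1-n)^{-1/\theta}$, a small numerical sketch helps. The parameter values below are illustrative assumptions, and `crossover_time` and `apparent_exponent` are hypothetical helper names.

```python
def crossover_time(c, n, theta):
    """Crossover time t* = c * (1 - n)**(-1/theta) of the dressed Omori law
    in the sub-critical regime 0 <= n < 1."""
    if not (0.0 <= n < 1.0) or theta <= 0.0 or c <= 0.0:
        raise ValueError("requires 0 <= n < 1, theta > 0 and c > 0")
    return c * (1.0 - n) ** (-1.0 / theta)


def apparent_exponent(t, c, n, theta):
    """Asymptotic exponent of the dressed propagator on either side of t*:
    1 - theta before the crossover, 1 + theta after it."""
    return 1.0 - theta if t < crossover_time(c, n, theta) else 1.0 + theta


if __name__ == "__main__":
    # With c = 0.001, n = 0.9 and theta = 0.2: t* = 0.001 * 0.1**(-5) = 100
    # (in the same time unit as c).
    print(crossover_time(0.001, 0.9, 0.2))
```

Because of the exponent $-1/\theta$, $t^*$ grows very quickly as $n$ approaches 1 when $\theta$ is small, so that the regime with apparent exponent $1 - \theta$ can cover the whole observable time range.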

In the case $a > \beta/2$, this approximation is no longer valid and the problem is considerably more difficult due to the coupling between the fluctuations in the sequence of earthquake energies and the seismic rate. We have not been able to derive the detailed solution of the problem in this regime but nevertheless can predict that the apparent exponent for the dressed propagator $K(t)$ should change continuously from $1 - \theta$ to $1 + \theta$ as $a$ increases towards $\beta$ from below. The argument goes as follows. Starting from (8), it is clear that the larger $a$ is, the larger is the dependence between the times of occurrences contributing to the sum over $\delta(\tau - t_i)$ and the realizations of corresponding earthquake energies contributing to the sum over $\delta(E - E_i)$. This is due to the fact that very large earthquakes trigger many more aftershocks for large $a$, whose energies subsequently influence the times of occurrence of following earthquakes, and so on. The larger the number of triggered events per shock, the more intricately intertwined are the times of occurrence and energies of subsequent earthquakes.

We have not yet been able to derive a full and rigorous
analytical treatment of this dependence. Nevertheless,
it is possible to predict the major effect of this
dependence by the following argument. Consider two
random variables $X$ and $Y$, which are (linearly) correlated.
Such a linear correlation is equivalent to the
existence of a linear regression of one variable with
respect to the other: $Y = \gamma X + x$, where $\gamma$ is non-random
and is simply related to the correlation coefficient
between $X$ and $Y$, and $x$ is an idiosyncratic
noise uncorrelated with $X$. Then,

$$\langle XY \rangle = \gamma \langle X^2 \rangle + \langle X \rangle \langle x \rangle , \tag{18}$$

which means that the covariance of $X$ and $Y$ contains
a term proportional to the variance of $X$.
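The identity (18) can be checked on synthetic data. The sketch below is only an illustration: the distributions chosen for $X$ and for the noise $x$, the value of $\gamma$ and the sample size are arbitrary assumptions and play no role in the argument of the text.

```python
import numpy as np


def check_linear_dependence(gamma=0.7, n_samples=200_000, seed=0):
    """Compare both sides of Eq. (18) for Y = gamma * X + x with x independent of X."""
    rng = np.random.default_rng(seed)
    X = rng.exponential(scale=1.0, size=n_samples)       # assumed law for X
    x = rng.normal(loc=0.5, scale=1.0, size=n_samples)   # idiosyncratic noise
    Y = gamma * X + x
    lhs = np.mean(X * Y)                                  # <XY>
    rhs = gamma * np.mean(X**2) + np.mean(X) * np.mean(x) # gamma <X^2> + <X><x>
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = check_linear_dependence()
    print(f"<XY> = {lhs:.4f}, gamma <X^2> + <X><x> = {rhs:.4f}")
```

The two sides differ only by the sample covariance between $X$ and $x$, which vanishes as the sample size grows.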

Let us now apply this simple model to the effect
of the dependence between $X = \delta(\tau - t_i)$ and
$Y = \rho(E_i)\,\delta(E - E_i)$ in the earthquake cascade process.
We propose to take into account this dependence by
the following ansatz, which corrects (13), based on
a description capturing the dependence through the
second-order moment, that is, their covariance:

$$\langle \rho(E_i)\,\delta(E - E_i)\,\delta(\tau - t_i) \rangle \approx \langle \rho(E_i)\,\delta(E - E_i) \rangle \langle \delta(\tau - t_i) \rangle + \gamma(\alpha)\, \langle [\delta(\tau - t_i)]^2 \rangle , \tag{19}$$

where $\gamma(\alpha) = 0$ for $\alpha < \beta/2$ and increases with
$\alpha$ for $\alpha > \beta/2$. The quadratic term just expresses the
dependence between $\rho(E_i)\,\delta(E - E_i)$ and $\delta(\tau - t_i)$,
i.e., $\rho(E_i)\,\delta(E - E_i)$ has a contribution proportional
to $\delta(\tau - t_i)$ as in (18): the mechanism leading to the
quadratic term $\langle X^2 \rangle$ is at the source of
$[\delta(\tau - t_i)]^2$ in (19). This new contribution leads to a
modification of (15) according to

$$K(t) \sim \int^{t} d\tau\, \phi(t - \tau)\, K(\tau) + \gamma(\alpha) \int^{t} d\tau\, \phi(t - \tau)\, [K(\tau)]^2 . \tag{20}$$

Dropping the second term in the right-hand side of
(20) recovers (15). Dropping the first term in the
right-hand side of (20) yields the announced result
$K(t) \propto 1/t^{1+\theta}$ even in the regime $t < t^*$. We
should thus expect a cross-over from $K(t) \propto 1/t^{1-\theta}$
to $K(t) \propto 1/t^{1+\theta}$ as $\alpha$ increases from $\beta/2$ to $\beta$. This
prediction is verified accurately by our numerical
simulations.

Once we know the full (ensemble average) seismic
response $K(t)$ from a single event, the complete solution
of (9) for the average seismic rate $N(t)$ under
the action of the general source term $s(t)$ is

$$N(t) = \int_{-\infty}^{t} d\tau\, s(\tau)\, K(t - \tau) . \tag{21}$$

Expression (21) is nothing but the theorem of Green
functions for linear equations with source terms [Morse
and Feshbach, 1953]. Expression (21) reflects the intuitive
fact that the total seismic activity at time $t$ is the
sum of the contributions of all the external sources at
all earlier times $\tau$ which convey their influence up to
time $t$ via the "dressed propagator" (or renormalized
Omori law) $K(t - \tau)$. $K(t - \tau)$ is the relevant kernel
quantifying the influence of each source $s(\tau)$ because
it takes into account all possible paths of seismicity
from $\tau$ to $t$ triggered by each specific source.
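On a regular time grid, the Green-function sum (21) becomes a causal discrete convolution. The sketch below is one possible discretization and is not taken from the original work; the grid step, the regularized power-law kernel and the constant source in the example are assumptions made for illustration.

```python
import numpy as np


def average_rate(source, kernel, dt):
    """Riemann-sum version of Eq. (21): N(t_k) = sum_{j<=k} s(t_j) K(t_k - t_j) dt.

    `source[j]` samples s at t_j = j*dt and `kernel[i]` samples K at lag i*dt.
    Lags beyond the end of `kernel` are treated as contributing nothing.
    """
    source = np.asarray(source, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    full = np.convolve(source, kernel) * dt   # full causal convolution
    return full[: len(source)]                # keep the times covered by the source


if __name__ == "__main__":
    dt, theta, c = 0.01, 0.2, 0.001           # illustrative values only
    lags = np.arange(2000) * dt
    kernel = (lags + c) ** (-(1.0 - theta))   # assumed form K(t) ~ 1/(t + c)^(1 - theta)
    source = np.ones_like(lags)               # steady-state source switched on at t = 0
    rate = average_rate(source, kernel, dt)
    print(rate[[0, 10, 100, 1999]])
```

Because the kernel enters only through its sampled values, the same routine can be applied to any numerical estimate of $K(t)$.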

2.3. Deviations from the average seismicity 
rate 

Similarly to the definition (15) of the average
renormalized propagator $K(t)$, let us introduce the
stochastic propagator $\kappa(t)$, defined as the solution
of (7) or (8) for the source term $s(t) = \delta(t)$. The
propagator $\kappa(t)$ is thus the seismicity rate initiated
by a single earthquake at the origin of times, which
takes into account the specific sequence of generated
earthquakes. Since the earthquakes are generated
according to a probabilistic (generalized Poisson)
process, repeating the history leads in general
to different realizations. $\kappa(t)$ is thus fundamentally
realization-specific and there are as many different
$\kappa(t)$'s as there are different earthquake sequences. In
other words, $\kappa(t)$ is a stochastic function. Obviously,
$\langle \kappa(t) \rangle = K(t)$, that is, its ensemble average retrieves
the average renormalized propagator.

From the structure of (7) or (8) which are linear
sums over events, an expression similar to (21) can
be written for the non-average seismic rate with an
arbitrary source term $s(t)$:

$$\lambda(t) = \int_{-\infty}^{t} d\tau\, s(\tau)\, \kappa_{\{\tau\}}(t - \tau) , \tag{22}$$

where the subscript $\{\tau\}$ in the stochastic kernel
$\kappa_{\{\tau\}}(t - \tau)$ captures the fact that there is a different
stochastic realization of $\kappa$ for each successive source. Taking
the ensemble average of (22) recovers (21). The difference
between the stochastic kernel $\kappa_{\{\tau\}}(t - \tau)$, the
local propagator $\phi_E(t)$ and the renormalized propagator
$K(t)$ is illustrated in Figure 1 for a numerical
simulation of the ETAS model.

We show in Appendix A that $\lambda(t)$ can be expressed as

$$\lambda(t) = N(t) + \int_{-\infty}^{t} d\tau\, \eta(\tau)\, K(t - \tau) , \tag{23}$$

where $\eta(t)$ is a stationary noise which can be suitably
defined. This is the case because the fluctuations
$\delta P(E)$ of the Gutenberg-Richter law and of the
source $s(t)$ are stationary processes, and because the
fluctuations $\delta\kappa$ are proportional to $K(t)$. The
expression of $\eta(t)$ can be determined explicitly in the
case where the fluctuations of the energy distribution
$P(E)$ dominate the fluctuations of the seismicity rate
$\kappa(t)$ (see Appendix A).

3. Derivation of the inverse Omori law 
and consequences 

3.1. Synthesis of the results 

The normal regime in the ETAS model corresponds
to the subcritical case $n < 1$ for which the seismicity
rate decays on average after a mainshock to a constant
background (in the case of a steady-state source)
decorated by fluctuations. How is it then possible in
this framework to get an accelerating seismicity pre- 
ceding a large event? Conceptually, the answer lies 
in the fact that when one defines a mainshock and 
its foreshocks, one introduces automatically a con- 
ditioning (in the sense of the theory of probability) 
in the earthquake statistics. As we shall see, this 
conditioning means that specific precursors and af- 
tershocks must precede and follow a large event. In 
other words, conditioned on the observation of a large 
event, the sequence of events preceding it cannot be 
arbitrary. We show below that it in fact follows the 
inverse Omori law in an average statistical sense. Fig- 
ure 2 presents typical realizations of foreshock and 
aftershock sequences in the ETAS model as well as 
the direct and inverse Omori law evaluated by aver- 
aging over many realizations. The deceleration of the 
aftershock activity is clearly observed for each indi- 
vidual sequence as well as in their average. Going to 
backward time to compare with foreshocks, the ac- 
celeration of aftershock seismicity when approaching 
the main event is clearly visible for each sequence. 
In contrast, the acceleration of foreshock activity (in 
forward time) is only observable for the ensemble av- 
erage while each realization exhibits large fluctuations 
with no clearly visible acceleration. This stresses the 
fact that the inverse Omori law is a statistical state- 
ment, which has a low probability to be observed in 
any specific sequence. 

Intuitively, it is clear that within the ETAS model, 
an event is more likely to occur after an increase both 
in seismicity rate and in magnitudes of the earth- 
quakes, so that this increase of seismicity can trigger 
an event with a non-negligible probability. Indeed, 
within the ETAS model, all events are the result of 
the sum of the background seismicity (due to tectonic 
forces) and of all other earthquakes that can trigger 
their aftershocks. 

How does the condition that an earthquake sequence
ends at a mainshock impact the seismicity
prior to that mainshock? How does this condition create
the inverse Omori law? Since earthquake magnitudes
are independently drawn from the Gutenberg-Richter
law, the statistical qualification of a mainshock,
which we place without loss of generality at the
origin of time, corresponds to imposing an anomalous
burst of seismic activity $\lambda(0) = \langle N \rangle + \lambda_0$ at $t = 0$
above its average level $\langle N \rangle$ given by (14). For the
study of type II foreshocks (as defined in the introduction),
we do not constrain the mainshock to be
larger than the seismicity before and after this mainshock.
For large mainshock magnitudes, relaxing this
hypothesis will not change the results derived below.

The question then translates into what is the path
taken by the noise $\eta(\tau)$ in (23) for $-\infty < \tau < 0$ that
may give rise to this burst $\lambda_0$ of activity. The solution
is obtained from the key concept that the set of $\eta(\tau)$'s
for $-\infty < \tau < 0$ is biased by the existence of the
conditioning, i.e., by the large value of $\lambda(0) = \langle N \rangle + \lambda_0$
at $t = 0$. This does not mean that there is an unconditional
bias. Rather, the existence of a mainshock
requires that a specific sequence of noise realizations
must have taken place to ensure its existence. This
idea is similar to the well-known result that an unbiased
random walk $W(t)$ with unconditional Gaussian
increments with zero mean sees its position take a
non-zero expectation

$$\langle W(\tau) \rangle |_c = [W(t) - W(0)]\, \frac{\tau}{t} , \tag{24}$$

if one knows the beginning $W(0)$ and the end $W(t)$
positions of the random walk, while the unconditional
expectation $\langle W(\tau) \rangle$ is identically zero. Similarly, the
conditional increment from $\tau$ to $\tau + d\tau$ of this random
walk becomes non-zero and equal to (in non-rigorous
notation)

$$d\tau\, \frac{W(t) - W(0)}{t} , \tag{25}$$

in contrast with the zero value of the unconditional
increments.
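The random-walk analogy (24) can be illustrated by simulation. For Gaussian variables, the conditional expectation is linear in the conditioning variable, so the ratio $\langle W(\tau) W(t) \rangle / \langle W(t)^2 \rangle$ estimated over many walks should approach $\tau/t$ when $W(0) = 0$. The sketch below is such a check; the number of walks, the number of steps and the seed are arbitrary assumptions.

```python
import numpy as np


def conditional_slope(n_walks=20_000, n_steps=100, k=30, seed=1):
    """Estimate the coefficient a in E[W(k) | W(n_steps)] = a * W(n_steps), with W(0) = 0."""
    rng = np.random.default_rng(seed)
    steps = rng.standard_normal((n_walks, n_steps))  # unbiased Gaussian increments
    walks = np.cumsum(steps, axis=1)                 # column j holds W(j + 1)
    w_mid = walks[:, k - 1]                          # position at intermediate time k
    w_end = walks[:, -1]                             # position at final time n_steps
    return np.mean(w_mid * w_end) / np.mean(w_end**2)


if __name__ == "__main__":
    # Eq. (24) predicts a slope close to tau / t = 30 / 100.
    print(conditional_slope())
```

Although each increment is unbiased, knowing the end point tilts the whole earlier path towards it, which is the mechanism invoked for the noise preceding a mainshock.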

In the ETAS model, which is a marked point process,
the main source of the noise on $\lambda(t)$ comes
from the "marks", that is, the energies drawn for
each earthquake from the Gutenberg-Richter power
law distribution (3). Expression (2) shows that the
amplitude $\eta_\tau$ of the fluctuations in the seismic rate
is proportional to $E_\tau^\alpha$, where $E_\tau$ is the energy of a
mother-earthquake occurring at time $\tau$. Since the energies
are distributed according to the power law (3)
with exponent $\beta$, $\eta_\tau \propto E_\tau^\alpha$ is distributed according to
a power law with exponent $m = \beta/\alpha$ (see for instance
chapter 4.4 of [Sornette, 2000]).
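The change of exponent from $\beta$ to $m = \beta/\alpha$ under the map $E \to E^\alpha$ can be checked on synthetic samples. The sketch below assumes, for illustration only, a pure Pareto law with survival function $P(E > e) = e^{-\beta}$ for $e \geq 1$, and estimates the tail exponent of $\eta = E^\alpha$ by maximum likelihood (the Hill estimator with a unit lower cutoff). The parameter values are arbitrary.

```python
import numpy as np


def tail_exponent_of_eta(beta=0.66, alpha=0.4, n_samples=100_000, seed=2):
    """Maximum-likelihood tail exponent of eta = E^alpha for Pareto-distributed E."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random(n_samples)           # uniform on (0, 1]
    energy = u ** (-1.0 / beta)               # P(E > e) = e^(-beta) for e >= 1
    eta = energy ** alpha                     # fluctuation amplitude ~ E^alpha
    return n_samples / np.sum(np.log(eta))    # Hill / ML estimate with eta_min = 1


if __name__ == "__main__":
    beta, alpha = 0.66, 0.4
    print("estimated m:", tail_exponent_of_eta(beta, alpha))
    print("beta/alpha :", beta / alpha)
```

With these assumed values $m = 1.65 < 2$, so the variance of $\eta$ does not exist, which is the situation of the second case discussed in the text.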

We first study the subcritical regime $n < 1$ for
times $t_c - t < t^*$, where $t^*$ is defined by (17). Two
cases must then be considered.

• For $\alpha < \beta/2$, $m > 2$, the variance and covariance
of the noise $\eta_\tau$ exist and one can use
conditional covariances to calculate conditional
expectations. We show below that the inverse
Omori law takes the form

$$E[\lambda(t)|\lambda_0] \propto \frac{\lambda_0}{(t_c - t)^{1-2\theta}} , \tag{26}$$

that is, $p' = 1 - 2\theta$.

• For $\alpha > \beta/2$, $m = \beta/\alpha < 2$ and the variance
and covariance of $\eta_\tau$ do not exist: one needs a
special treatment based on stable distributions.
In this case, neglecting the coupling between the
fluctuations in the earthquake energies and the
seismic rate, we find that the inverse Omori law
takes the form

$$E[\lambda(t)|\lambda_0] \propto \frac{\lambda_0}{(t_c - t)^{1-m\theta}} . \tag{27}$$

Taking into account the dependence between
the fluctuations in the earthquake energies and
the seismic rate, the exponent $p'$ progressively
increases from $1 - 2\theta$ towards the value $1 + \theta$
of the bare propagator as $\alpha$ goes from $\beta/2$ to $\beta$
(see Figure 6). The increase of $p'$ is thus faster
than the dependence $1 - m\theta$ predicted by (27).
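The two cases above can be summarized in a short helper that returns the exponent $p'$ predicted by (26) and (27) close to the mainshock ($t_c - t < t^*$). This is only a sketch: the function name is ours, and the branch for $\alpha > \beta/2$ deliberately ignores the coupling between energy and rate fluctuations, which according to the text makes the true $p'$ larger.

```python
import math


def inverse_omori_exponent(alpha: float, beta: float, theta: float) -> float:
    """Exponent p' of Eqs. (26)-(27) for t_c - t < t*, coupling neglected."""
    if beta <= 0.0 or not 0.0 <= alpha <= beta:
        raise ValueError("expected beta > 0 and 0 <= alpha <= beta")
    m = math.inf if alpha == 0.0 else beta / alpha   # tail exponent of the noise
    if m >= 2.0:
        return 1.0 - 2.0 * theta                     # Eq. (26): finite variance
    return 1.0 - m * theta                           # Eq. (27): stable-law regime


if __name__ == "__main__":
    theta, beta = 0.2, 1.0                           # illustrative values only
    for alpha in (0.2, 0.5, 0.8, 1.0):
        p_inv = inverse_omori_exponent(alpha, beta, theta)
        print(f"alpha/beta = {alpha / beta:.1f}: p' = {p_inv:.3f}, p = {1.0 - theta:.3f}")
```

In both branches the returned value does not exceed $1 - \theta$, the exponent of the renormalized propagator in the same time window.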

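In the large times limit $t_c - t > t^*$ (far from the mainshock)
of the subcritical regime, we also obtain an
inverse Omori law which takes the form

$$E[\lambda(t)|\lambda_0] \propto \frac{\lambda_0}{(t_c - t)^{1+\theta}} \quad \text{for } \alpha < \beta/2 \tag{28}$$

and

$$E[\lambda(t)|\lambda_0] \propto \frac{\lambda_0}{(t_c - t)^{1+(m-1)\theta}} \quad \text{for } \beta/2 < \alpha < \beta . \tag{29}$$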

The direct and inverse Omori laws are clearly observed
in numerical simulations of the ETAS model,
when stacking many sequences of foreshocks and aftershocks,
for various mainshock magnitudes (Figures
3 and 4). Our main result shown in Figure 3 is that,
due to conditioning, the inverse Omori law is different
from the direct Omori law, in that the exponent
$p'$ of the inverse Omori law is in general smaller than
the exponent $p$ of the direct Omori law. Another
fundamental difference between aftershocks and foreshocks
found in the ETAS model is that the number
of aftershocks increases as a power $E^\alpha$ of the mainshock
energy $E$ as given by (2), whereas the number
of foreshocks of type II is independent of the mainshock
energy (see Figures 3 and 4). Because in the
ETAS model the magnitude of each event is independent
of the magnitude of the triggering events,
and of the previous seismicity, the rate of seismicity
increases on average according to the inverse Omori
law before any earthquake, whatever its magnitude.
The number of foreshocks of type I increases with
the mainshock magnitude for small and intermediate
mainshock magnitudes, and saturates to the level
of foreshocks of type II for large mainshocks, because
the selection/condition acting on those defined foreshocks
becomes less and less severe as the magnitude
of the mainshock increases (see Figure 5). The
conditioning that foreshocks of type I must be smaller
than their mainshock induces an apparent increase
of the Omori exponent $p'$ as the mainshock magnitude
decreases. The predictions (16) and (26) on the
$p$ and $p'$-values of type II foreshocks are well verified
by numerical simulations of the ETAS model up to
$\alpha/\beta = 0.5$, as presented in Figure 6. However, for
$\alpha/\beta > 0.5$, both $p$ and $p'$ are found larger than predicted
by (16) and (27) respectively, due to the coupling
between the fluctuations in the earthquake energies
and those of the seismic rate. This coupling
occurs because the variance of the number $\rho(E)$ of
direct aftershocks of an earthquake of energy $E$ is unbounded
for $\alpha > \beta/2$, leading to strong bursts of seismic
activity coupled with strong fluctuations of the
earthquake energies. In this regime, expression (20)
shows that $p$ changes continuously from $1 - \theta$ for
$\alpha/\beta = 0.5$ to $1 + \theta$ for $\alpha = \beta$, in good agreement with
the results of the numerical simulations. In this case
$\alpha > \beta/2$, the exponent $p'$ is also observed to increase
from $p' = 1 - 2\theta$ for $\alpha = \beta/2$ to $p' = 1 + \theta$ for
$\alpha = \beta$, as predicted below.

The dissymmetry between the inverse Omori law for
foreshocks and the direct Omori law (16) for aftershocks
stems from the fact that, for foreshocks, one
observes a seismic rate conditional on a large rate
at the time $t_c$ of the mainshock while, for the aftershocks,
one observes the direct response $K(t)$ to a single
large shock. The latter effect stems from the term
$\rho(E)$ given by (2) in the bare Omori propagator, which
ensures that a mainshock with a large magnitude triggers
aftershocks which overwhelmingly dominate the
seismic activity. In the special case where one takes
the exponent $\alpha = 0$ in (2), a mainshock of large magnitude
has no more daughters than any other earthquake.
As a consequence, the observed Omori law
stems from the same mechanism as for the foreshocks,
and the increasing foreshock activity (26) gives the
same parametric form for the aftershock decay, with
$t_c - t$ replaced by $t - t_c$ (this is for instance obtained
through the Laplace transform of the seismic rate).
This gives the exponent $p = p' = 1 - 2\theta$ for $\alpha = 0$ as
for the foreshocks, but the number of aftershocks is
still larger than the number of foreshocks. This result
is borne out by our numerical simulations (not shown).

These results and the derivations of the inverse
Omori law make clear that mainshocks are more than
just the aftershocks of their foreshocks, as sometimes
suggested [Shaw, 1993; Jones et al., 1999]. The key
concept is that all earthquakes are preceded by some
seismic activity and may be seen as the result of this
seismic activity. However, on average, this seismic
activity must increase to be compatible statistically
with the occurrence of the main shock: this is an unavoidable
statistical certainty within the ETAS model,
that we derive below. The inverse Omori law is fundamentally
a conditional statistical law which derives
from a double renormalization process: (1) the renormalization
from the bare Omori propagator $\phi(t)$ defined
by (4) into the renormalized or dressed propagator
$K(t)$ and (2) the conditioning of the fluctuations
in seismic activity upon a large seismic activity associated
with the mainshock. In summary, we can state
that mainshocks are aftershocks of conditional foreshocks.
We stress again that the statistical nature
of foreshocks does not imply that there is no information
in foreshocks on future large earthquakes. As
discussed below, in the ETAS model, foreshocks are
forerunners of large shocks.

3.2. The inverse Omori law $\sim 1/t^{1-2\theta}$ for
$\alpha < \beta/2$

Let us call $X(t) = \lambda(t) - N(t)$ given by (23) and
$Y = \lambda(0) - N(0)$. It is a standard result of stochastic
processes with finite variance and covariance that the
expectation of $X(t)$ conditioned on $Y = \lambda_0$ is given
by [Jacod and Shiryaev, 1987]

$$E[X(t)|Y = \lambda_0] = \lambda_0\, \frac{\mathrm{Cov}(X(t),Y)}{E[Y^2]} , \tag{30}$$

where $E[Y^2]$ denotes the expectation of $Y^2$ and $\mathrm{Cov}(X(t),Y)$
is the covariance of $X$ and $Y$. Expression (30) recovers
the obvious result that $E[X(t)|Y = \lambda_0] = 0$ if $X$
and $Y$ are uncorrelated.

Using the form (23) for $X(t) = \lambda(t) - N(t)$ and the
fact that $X$ has a finite variance, we obtain

$$\mathrm{Cov}(X(t),Y) = \int_{-\infty}^{t} d\tau \int_{-\infty}^{0} d\tau'\, K(t - \tau)\, K(-\tau')\, \mathrm{Cov}[\eta(\tau), \eta(\tau')] . \tag{31}$$

For a dependence structure of $\eta(\tau)$ falling much faster
than the kernel $K(t)$, the leading behavior of $\mathrm{Cov}(X(t),Y)$
is obtained by taking the limit $\mathrm{Cov}(\eta(\tau), \eta(\tau')) = \delta(\tau - \tau')$.
This yields

$$\mathrm{Cov}(X(t),Y) = \int_{-\infty}^{t} d\tau\, K(t - \tau)\, K(-\tau) , \tag{32}$$

and

$$E[Y^2] = \int_{-\infty}^{0} d\tau\, [K(-\tau)]^2 . \tag{33}$$

$E[Y^2]$ is thus a constant while, for $|t| < t^*$ where $t^*$ is
defined in (17), $\mathrm{Cov}(X(t),Y) \sim 1/|t|^{1-2\theta}$. Generalizing
to a mainshock occurring at an arbitrary time $t_c$,
this yields the inverse Omori law

$$E[\lambda(t)|\lambda(t_c) = \langle N \rangle + \lambda_0] = \langle N \rangle + C\, \frac{\lambda_0}{(t_c - t)^{1-2\theta}} , \tag{34}$$

where $C$ is a positive numerical constant.

Expression (34) predicts an inverse Omori law for
foreshocks in the form of an average acceleration of
seismicity proportional to $1/(t_c - t)^{p'}$ with the inverse
Omori exponent $p' = 1 - 2\theta$, prior to a mainshock.
This exponent $p'$ is smaller than the exponent
$p = 1 - \theta$ of the renormalized propagator $K(t)$ describing
the direct Omori law for aftershocks. This prediction
is well verified by numerical simulations of the ETAS
model shown in Figure 3.
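The scaling of the covariance integral (32) can be checked numerically. The sketch below assumes, for illustration, the regularized kernel $K(u) = 1/(u + c)^{1-\theta}$ at all lags (that is, it ignores the crossover at $t^*$) and requires $\theta < 1/2$ for the integral to converge. After the change of variable $u = t - \tau$, the covariance at $t = -|t|$ reads $\int_0^\infty du\, K(u)\, K(u + |t|)$, whose log-log slope in $|t|$ is then measured.

```python
import numpy as np


def covariance_integral(t_abs, theta=0.2, c=1e-3, n_grid=6000):
    """Eq. (32) at t = -t_abs < 0 for the assumed kernel K(u) = (u + c)^-(1 - theta)."""
    # With u = t - tau, Eq. (32) becomes int_0^inf K(u) K(u + |t|) du.
    u = np.logspace(-12, 14, n_grid)
    integrand = (u + c) ** (theta - 1.0) * (u + t_abs + c) ** (theta - 1.0)
    # Trapezoid rule on a log-spaced grid, fine enough for power-law integrands.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u))


def log_slope(t1=1e2, t2=1e4, theta=0.2, c=1e-3):
    """Effective exponent d ln Cov / d ln |t| measured between t1 and t2."""
    c1 = covariance_integral(t1, theta, c)
    c2 = covariance_integral(t2, theta, c)
    return np.log(c2 / c1) / np.log(t2 / t1)


if __name__ == "__main__":
    theta = 0.2
    print("measured slope :", log_slope(theta=theta))
    print("-(1 - 2 theta) :", -(1.0 - 2.0 * theta))
```

The measured slope approaches $-(1 - 2\theta)$ for $|t| \gg c$; the small residual difference comes from the regularization constant $c$.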

As we pointed out in the introduction, Shaw [1993]
derived the relationship $p' = 2p - 1$, which yields
$p' = 1 - 2\theta$ for $p = 1 - \theta$, based on a clever but,
interestingly, incorrect reasoning that we now clarify.
Actually, there are two ways of viewing his argument.
The most straightforward one, used by Shaw himself,
consists in considering a single aftershock sequence
starting at time 0 from a large mainshock. Let us
consider two aftershocks at times $t - \tau$ and $t$. Forgetting
any constraint on the energies, the earthquake
at time $t - \tau$ can be viewed as a foreshock of the
earthquake at time $t$. Summing over all possible positions
of these two earthquakes at fixed time separation
$\tau$ then amounts to constructing a kind of foreshock
count which obeys the equation

$$\int^{+\infty} dt\, K(t - \tau)\, K(t) , \tag{35}$$

where $K(t)$ is the number of aftershocks at time $t$.
This integral (35) recovers equation (12) of [Shaw,
1993]. If $K(t) \sim 1/t^p$, this integral predicts a dependence
$1/\tau^{2p-1}$ for the effective foreshock activity.
This derivation shows that the prediction $p' = 2p - 1$
results solely from the counting of pairs at fixed time
intervals in an aftershock sequence. It is a pure product
of the counting process.

We can also view this result from the point of view
of the ETAS model. In the language of the ETAS
model, Shaw's formula (12) uses the concept that a
mainshock is an aftershock of a cascade of aftershocks,
themselves deriving from an initial event. This idea
implies that the probability for a mainshock to occur
is the sum over all possible time occurrences of the
product of (i) the probability for an aftershock to be
triggered by the initial event and (ii) the probability
that this aftershock triggers its own aftershock at the
time of the mainshock. Shaw uses (what corresponds
to) the dressed propagator $K(t)$ for the first probability.
He also assumes that the rate of mainshocks
deriving from an aftershock of the initial event is proportional
to $K(t)$. However, from our previous studies
[Sornette and Sornette, 1999; Helmstetter and Sornette,
2002a] and the present work, one can see that
this corresponds to an illicit double counting or double
renormalization. This danger of double counting
is illustrated by comparing the formulas (9, 15) with
(21). Either the direct tectonic source of seismicity
$s(t)$ impacts the future seismicity by a weight given by
the renormalized or dressed propagator as in (21), or
we forget about the tectonic source term $s(t)$ and
only record all past seismic activity (all sources and
all triggered events) as in (9, 15), but then the impact
of all past seismicity on future seismicity is weighted
by the bare propagator. These two viewpoints are
completely equivalent and are two alternative expressions
of the Green theorem. What is then the reason
for the correct $1/t^{1-2\theta}$ inverse Omori law derived by
Shaw [1993]? It turns out that his (erroneous) double
counting recovers the mathematical form resulting
from the effect of the conditioning of past seismicity
leading to $s(t) \sim K(t)$, valid for $\alpha < \beta/2$ as derived
below in (39). Indeed, inserting $s(t) \sim K(t)$ in (21)
retrieves the correct prediction $1/t^{1-2\theta}$ for the inverse
Omori law. This proportionality $s(t) \sim K(t)$ is physically
at the origin of (32), which is itself at the origin of the
inverse Omori law. In other words, Shaw obtains the
correct result (35) by incorrect double counting, while
the correct way to get (35) is that the mainshock is
conditional on a specific average trajectory of past
seismicity captured by $s(t) \sim K(t)$. In addition to
providing a more correct reasoning, our approach allows
one to explore the role of different parameter
regimes and, in particular, to analyze the failure of
the argument for $\alpha > \beta/2$, as already explained.

3.3. The inverse Omori law $\sim 1/t^{1-\theta\beta/\alpha}$ for
$\alpha > \beta/2$

Expression (23) defines the fluctuating part
$X(t) = \lambda(t) - N(t)$ of the seismic rate as a sum of random
variables $\eta(\tau)$ with power law distributions weighted
by the kernel $K(t - \tau)$. These random variables $\eta(\tau)$,
which are mainly dominated by the fluctuations in
event magnitudes but also receive contributions from
the intermittent seismic rate, are conditioned by the
realization of a large seismicity rate

$$X(0) = \lambda_0 = \int_{-\infty}^{0} d\tau\, \eta(\tau)\, K(-\tau) , \tag{36}$$

which is the correct statistical implementation of the
condition of the existence of a large shock at $t = 0$.
Since the conditioning is performed on $X(0)$, that is,
upon the full set of noise realizations acting up to
time $t = 0$, the corresponding conditional noises up
to time $t < 0$ all contribute to $E[X(t)|X(0) = \lambda_0]_{t<0}$
by their conditional expectations as

$$E[X(t)|X(0) = \lambda_0]_{t<0} = \int_{-\infty}^{t} d\tau\, E[\eta(\tau)|X(0)]\, K(t - \tau) . \tag{37}$$

In Appendix B, it is shown that, for independent and
identically distributed random variables $x_i$ distributed
according to a power law with exponent
$m = \beta/\alpha < 2$ and entering the sum

$$S_N = \sum_{i=1}^{N} K_i\, x_i , \tag{38}$$

where the $K_i$ are arbitrary positive weights, the expectation
$E[x_i|S_N]$ of $x_i$ conditioned on the existence
of a large realization of $S_N$ is given by

$$E[x_i|S_N] \propto S_N\, K_i^{m-1} . \tag{39}$$

To apply this result to (37), it is convenient to
discretize it. Some care should however be exercised
in this discretization (1) to account for the expected
power law acceleration of $E[\lambda(t)|\lambda(0) = \lambda_0]$ up to
$t = 0$ and (2) to discretize correctly the random noise.
We thus write

$$\int_{-\infty}^{0} d\tau\, \eta(\tau)\, K(-\tau) \sim \sum_{\tau_i<0} \int_{\tau_i}^{\tau_{i+1}} d\tau\, \eta(\tau)\, K(-\tau) \sim \sum_{\tau_i<0} (\tau_{i+1} - \tau_i)\, K(-\tau_i)\, x_i , \tag{40}$$

where $x_i \sim \eta_i (\tau_{i+1} - \tau_i)$ is the stationary discrete noise
distributed according to a power law distribution with
exponent $m = \beta/\alpha$. The factor $(\tau_{i+1} - \tau_i) \propto |\tau_i|$ in
front of the kernel $K(-\tau_i)$ is needed to regularize the
discretization in the presence of the power law acceleration
up to time 0. In the notation of Appendix
B, $(\tau_{i+1} - \tau_i)\, K(-\tau_i) \propto |\tau_i|\, K(-\tau_i) \sim 1/|\tau_i|^{-\theta}$ plays
the role of $K_i$. We also need an additional factor
$(\tau_{i+1} - \tau_i)$ to obtain a regularized noise term: thus,
$\eta_i (\tau_{i+1} - \tau_i) \propto \eta_i |\tau_i|$ plays the role of $x_i$. This
discretization procedure recovers the results obtained by
using (30) and the variance and covariance of the continuous
integrals for the case $\alpha < \beta/2$ where they are
defined. Note that the last expression in equation
(40) does not keep track of the dimensions as we are
only able to obtain the leading scaling behavior in the
discretization scheme.

Using (39), we thus obtain
$E[\eta_i |\tau_i| \,|\, \lambda(0) = \lambda_0] \propto \lambda_0\, |\tau_i|^{\theta(m-1)}$
and thus

$$E[\eta_i|\lambda(0) = \lambda_0] \propto \frac{\lambda_0}{|\tau_i|^{1-\theta(m-1)}} . \tag{41}$$

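Similarly to (40), the discrete equivalent to (37)
reads

$$E[X(t)|\lambda(0) = \lambda_0]_{t<0} \sim \sum_{\tau_i<t} (\tau_{i+1} - \tau_i)\, K(t - \tau_i)\, E[\eta_i|\lambda(0) = \lambda_0] \sim \int_{-\infty}^{t} d\tau\, \frac{1}{|t - \tau|^{1-\theta}}\, \frac{\lambda_0}{|\tau|^{1-\theta(m-1)}} \sim \frac{\lambda_0}{|t|^{1-m\theta}} , \tag{42}$$

where we have re-introduced the factors $\tau_{i+1} - \tau_i$ to revert
to the continuous integral formulation and have
used the definition $m = \beta/\alpha$. Expression (42) gives the
inverse Omori law

$$E[X(t)|X(t_c) = \lambda_0]_{t<t_c} \propto \frac{\lambda_0}{(t_c - t)^{1-\theta\beta/\alpha}} \tag{43}$$

for foreshock activity prior to a mainshock occurring
at time $t_c$. Note that the border case $m = \beta/\alpha = 2$
recovers our previous result (34) as it should.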
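The power counting behind (42) can be verified numerically. With $u = t - \tau$, the integral in (42) becomes $\int_0^\infty du\, u^{\theta-1} (u + |t|)^{-(1-\theta(m-1))}$, which equals $B(\theta, 1 - m\theta)\, |t|^{m\theta-1}$ when $0 < m\theta < 1$, where $B$ is the Euler Beta function. The sketch below compares a direct quadrature with this closed form; the closed form and the parameter values are our own additions for illustration and are not part of the original derivation.

```python
import math

import numpy as np


def conditional_rate_integral(t_abs, theta, m, n_grid=8000):
    """Quadrature of int_0^inf u^(theta-1) (u + |t|)^-(1 - theta (m-1)) du, as in Eq. (42)."""
    u = t_abs * np.logspace(-30, 30, n_grid)
    f = u ** (theta - 1.0) * (u + t_abs) ** (-(1.0 - theta * (m - 1.0)))
    g = f * u                      # integrate in ln(u): du = u d(ln u)
    ln_u = np.log(u)
    return np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(ln_u))


def closed_form(t_abs, theta, m):
    """B(theta, 1 - m theta) * |t|^(m theta - 1), valid for 0 < m theta < 1."""
    a, b = theta, 1.0 - m * theta
    beta_fn = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return beta_fn * t_abs ** (m * theta - 1.0)


if __name__ == "__main__":
    theta, m = 0.2, 1.5            # illustrative values with m = beta/alpha < 2
    for t_abs in (1.0, 10.0, 100.0):
        num = conditional_rate_integral(t_abs, theta, m)
        print(t_abs, num, closed_form(t_abs, theta, m))
```

The agreement at several values of $|t|$ confirms the exponent $1 - m\theta$ appearing in (42) and (43) under the stated assumptions.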

The problem is that this derivation does not take
into account the dependence between the fluctuations
in the earthquake energies and the seismic rate, which
becomes prominent precisely in this regime
$\beta/2 < \alpha < \beta$. We have not yet been able to fully solve
this problem for arbitrary values of $\alpha$ but can nevertheless
predict that (43) must be replaced by

$$E[X(t)|X(0) = \lambda_0]_{t<0} \propto \frac{\lambda_0}{|t|^{1+\theta}} , \quad \text{for } \alpha \to \beta . \tag{44}$$

We follow step by step the reasoning from expression
(37) to (42), with the following modifications imposed
by the regime $\beta/2 < \alpha < \beta$.

1. The conditional expectations given by (39) must
be progressively changed into $E[x_i|S_N] \propto S_N K_i$
as $\alpha \to \beta$, due to the coupling between energy
and seismic rate fluctuations (leading to (19) via
the mechanism (18)). Indeed, the coupling between
energy and seismic rate fluctuations gives
rise to the dependence $E[x_i|S_N] \propto K_i$ which
becomes dominant over the conditional expectations
given by (39) for $m < 2$.

2. As shown with (20), the dependence between
the fluctuations in the earthquake energies and
the seismic rate leads to changing $K(t) \propto 1/t^{1-\theta}$
into $K(t) \propto 1/t^{1+\theta}$ as $\alpha \to \beta$ even in the regime
$t < t^*$.

This leads finally to changing expression (42) into

$$E[X(t)|X(0)] \sim \int_{-\infty}^{t} d\tau\, \frac{1}{|t - \tau + c|^{1+\theta}}\, \frac{1}{|\tau|^{1+\theta}} , \tag{45}$$

where we have re-introduced the regularization constant
$c$ to ensure convergence for $\tau \to t$. Taking into
account the contribution $\propto t^\theta$ at this upper bound $t$
of the integrand $\propto 1/|t - \tau + c|^{1+\theta}$, we finally get (44).
This result is verified numerically in Figure 6.

3.4. The inverse Omori law in the regime
$t_c - t > t^*$

The inverse Omori laws derived in the two preceding
sections are valid for $t_c - t < t^*$, that is, sufficiently
close to the mainshock. A similar inverse Omori law is
also obtained for $t_c - t > t^*$. To this end, we use (16)
showing that the propagator $K(t - \tau) \propto 1/(t - \tau)^{1-\theta}$
must be replaced by $K(t - \tau) \propto 1/(t - \tau)^{1+\theta}$ for time
differences larger than $t^*$. It would however be incorrect
to deduce that we just have to change $-\theta$ into
$+\theta$ in expressions (34) and (43), because the integrals
leading to these results behave differently: as in (45),
one has to re-introduce the regularization constant $c$
to ensure convergence for $\tau \to t$ of $1/|t - \tau + c|^{1+\theta}$. The
final results are thus given by (34) and (43) by changing
$-\theta$ into $+\theta$ and by multiplying these expressions
by the factor $t^\theta$ stemming from the regularization $c$.
Thus, in the large time limit $t_c - t > t^*$ (far from the
mainshock) of the subcritical regime, we also obtain
an inverse Omori law which takes the form (28) for
$\alpha < \beta/2$ and the form (29) for $\beta/2 < \alpha < \beta$. These
predictions are in good agreement with our numerical
simulations.

4. Prediction for the
Gutenberg-Richter distribution of
foreshocks

We have just shown that the stochastic component
of the seismic rate can be formulated as a sum of the
form (38) of variables $x_i$ distributed according to a
power law with exponent $m = \beta/\alpha$ and weight $K_i$. It
is possible to go beyond the derivation of the conditional
expectation $E[x_i|S_N]$ given by (39) and obtain
the conditional distribution $p(x_i|S_N)$ conditioned on
a large value of the realization of $S_N$.

For this, we use the definition of conditional probabilities

$$p(x_i|S_N) = \frac{p(S_N|x_i)\, p(x_i)}{P_N(S_N)} , \tag{46}$$

where $P_N(S_N)$ is the probability density function of
the sum $S_N$. Since $p(S_N|x_i)$ is simply given by

$$p(S_N|x_i) = P_{N-1}(S_N - K_i x_i) , \tag{47}$$

we obtain

$$p(x_i|S_N) = p(x_i)\, \frac{P_{N-1}(S_N - K_i x_i)}{P_N(S_N)} . \tag{48}$$

This shows that the conditional Gutenberg-Richter
distribution $p(x_i|S_N)$ is modified by the conditioning
according to the multiplicative correcting factor
$P_{N-1}(S_N - K_i x_i)/P_N(S_N)$. For large $N$, $P_N$ and
$P_{N-1}$ tend to stable Lévy distributions with the same
index $m$ but different scale factors equal respectively
to $\sum_j K_j^m$ and $\sum_{j \neq i} K_j^m$. The tail of $p(x_i|S_N)$ is thus

$$p(x_i|S_N) \sim \left( 1 - \frac{K_i^m}{\sum_j K_j^m} \right) \frac{1}{x_i^{1+m}\, (1 - (K_i x_i/S_N))^{1+m}} . \tag{49}$$

Since $K_i x_i \ll S_N$, we can expand the last term in the
right-hand side of (49) and obtain

$$p(x_i|S_N) \sim \left( 1 - \frac{K_i^m}{\sum_j K_j^m} \right) \left[ \frac{1}{x_i^{1+m}} + (1 + m)\, (K_i/S_N)\, \frac{1}{x_i^{m}} \right] . \tag{50}$$

Since $x_i \sim E_i^\alpha$, we use the transformation property
of distribution functions $p(x_i)\,dx_i = p(E_i)\,dE_i$
to obtain the pdf of foreshock energies $E_i$. Going
back to the continuous limit in which
$K_i/S_N \sim (t_c - t)^{-(1-\theta)} / (t_c - t)^{-(1-m\theta)} = 1/(t_c - t)^{(m-1)\theta}$,
we obtain the conditional Gutenberg-Richter distribution
for foreshocks

$$P(E|\lambda_0) \sim \frac{1}{E^{1+\beta}} + \frac{C}{(t_c - t)^{\theta(\beta-\alpha)/\alpha}\, E^{1+\beta'}} , \tag{51}$$

where

$$\beta' = \beta - \alpha , \tag{52}$$

and $C$ is a numerical constant. The remarkable prediction
(51) with (52) is that the Gutenberg-Richter
distribution is modified upon the approach of a mainshock
by developing a bump in its tail. This modification
takes the form of an additive power law contribution
with a new "b-value" renormalized/amplified
by the exponent $\alpha$ quantifying the dependence of the
number of daughters as a function of the energy of
the mother. Our prediction is validated very clearly
by numerical simulations reported in Figures 7 and 8.
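To visualize how an additive power law of this kind lowers the apparent exponent, the sketch below evaluates an unnormalized density made of a term $1/E^{1+\beta}$ plus a term $C/[(t_c - t)^{\theta(\beta-\alpha)/\alpha} E^{1+\beta'}]$ with $\beta' = \beta - \alpha$, and measures the local log-log slope at a given energy. The amplitude $C = 1$, the energy units and the parameter values are assumptions made purely for illustration.

```python
import numpy as np


def conditional_gr_density(energy, time_to_mainshock, beta, alpha, theta, amplitude=1.0):
    """Unnormalized sum of a 1/E^(1+beta) term and a growing 1/E^(1+beta-alpha) term."""
    energy = np.asarray(energy, dtype=float)
    beta_prime = beta - alpha                                    # Eq. (52)
    growth = time_to_mainshock ** (-theta * (beta - alpha) / alpha)
    return energy ** (-(1.0 + beta)) + amplitude * growth * energy ** (-(1.0 + beta_prime))


def local_exponent(energy, time_to_mainshock, beta, alpha, theta, amplitude=1.0, eps=1e-3):
    """Apparent exponent b_app such that the density behaves locally as 1/E^(1 + b_app)."""
    e1, e2 = energy * (1.0 - eps), energy * (1.0 + eps)
    p1 = conditional_gr_density(e1, time_to_mainshock, beta, alpha, theta, amplitude)
    p2 = conditional_gr_density(e2, time_to_mainshock, beta, alpha, theta, amplitude)
    return -np.log(p2 / p1) / np.log(e2 / e1) - 1.0


if __name__ == "__main__":
    beta, alpha, theta = 0.66, 0.5, 0.2        # illustrative values only
    for dt in (1e3, 1.0, 1e-3):                # time remaining before the mainshock
        print(dt, local_exponent(100.0, dt, beta, alpha, theta))
```

The local exponent always lies between $\beta'$ and $\beta$ and decreases as the mainshock approaches, which is why fitting a single power law would report an apparent decrease of the exponent.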

5. Migration of foreshocks towards the 
mainshock 

By the same mechanism leading to (34) via (30) and (32), conditioning the foreshock seismicity to culminate at a mainshock at time $t_c$ at some point taken as the origin of space must lead to a migration towards the mainshock. The seismic rate $\lambda(\vec r, t)$ at position $\vec r$ at time $t < t_c$ conditioned on the existence of the mainshock at position $\vec 0$ at time $t_c$ is given by

$$E[\lambda(\vec r,t)|\lambda(\vec 0,t_c)] \sim \int_{-\infty}^{t} d\tau \int d\vec\rho \; K(\vec r-\vec\rho,t-\tau) \, K(\vec\rho,t_c-\tau) . \qquad (53)$$

$K(\vec r-\vec\rho,t-\tau)$ is the dressed spatio-temporal propagator giving the seismic activity at position $\vec r$ and time $t$ resulting from a triggering earthquake that occurred at position $\vec\rho$ at a time $\tau$ in the past. Its expression is given in [Helmstetter and Sornette, 2002b] in a variety of situations. Assuming that the probability distribution for an earthquake to trigger an aftershock at a distance $r$ is of the form

$$p(r) \sim 1/(r + d)^{1+\mu} , \qquad (54)$$

Helmstetter and Sornette [2002b] have shown that the characteristic size of the aftershock area slowly diffuses according to $R \sim t^H$, where the time $t$ is counted from the time of the mainshock. For simplicity, $d$ is taken independent of the mainshock energy. $H$ is the Hurst exponent characterizing the diffusion given by

$$H = \frac{\theta}{\mu} \;\; \text{for} \;\; \mu < 2 , \qquad H = \frac{\theta}{2} \;\; \text{for} \;\; \mu > 2 . \qquad (55)$$

This diffusion is captured by the fact that $K(\vec r - \vec\rho, t - \tau)$ depends on $\vec r - \vec\rho$ and $t - \tau$ essentially through the reduced variable $|\vec r - \vec\rho|/(t - \tau)^H$. Then, expression (53) predicts that this diffusion must be reflected into an inward migration of foreshock seismicity towards the mainshock with the same exponent $H$.

These results are verified by numerical simulations of the ETAS model. Figure 9 presents the migration of foreshock activity for two numerical simulations of the ETAS model, with different parameters. As for the inverse Omori law, we have superposed many sequences of foreshock activity to observe the migration of foreshocks. For a numerical simulation with parameters $n = 1$, $\theta = 0.2$, $\mu = 1$, $d = 1$ km, $c = 0.001$ day, $a = 0.5\beta$ and $m_0 = 2$, we see clearly the localization of the seismicity as the mainshock approaches. We obtain an effective migration exponent $H = 0.18$, describing how the effective size $R$ of the cloud of foreshocks shrinks as time $t$ approaches the time $t_c$ of the main shock: $R \sim (t_c - t)^H$ (see Figure 9a, c). This result is in good agreement with the prediction $H = 0.2$ given by (55). The spatial distribution of foreshocks around the mainshock is similar to the distribution of aftershocks around the mainshock. Figure 9b, c presents the migration of foreshock activity for a numerical simulation with $\theta = 0.01$, $\mu = 1$, $d = 1$ km, $c = 0.001$ day leading to a very small diffusion exponent $H = 0.01$. The analysis of this foreshock sequence gives an effective migration exponent $H = 0.04$ for short times, and a faster apparent migration at longer times due to the influence of the background activity. See [Helmstetter and Sornette, 2002b] for a discussion of artifacts leading to apparent diffusions of seismicity resulting from various cross-over phenomena.
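
As a quick check of the predicted exponents quoted in this section, the following minimal Python sketch evaluates the diffusion exponent, reading (55) as $H = \theta/\mu$ for $\mu < 2$ and $H = \theta/2$ for $\mu > 2$. The helper name `hurst_exponent` is ours, and the treatment of the boundary $\mu = 2$ (where both branches coincide) is an assumption of the sketch.

```python
def hurst_exponent(theta: float, mu: float) -> float:
    """Diffusion/migration exponent H, reading (55) as
    H = theta/mu for mu < 2 and H = theta/2 for mu > 2.

    Hypothetical helper: at mu = 2 both branches give theta/2,
    so the boundary is assigned to the second branch.
    """
    if theta <= 0 or mu <= 0:
        raise ValueError("theta and mu are assumed positive in this sketch")
    return theta / mu if mu < 2 else theta / 2


if __name__ == "__main__":
    # Parameter sets of the two simulations discussed for Figure 9
    for theta, mu in [(0.2, 1.0), (0.01, 1.0)]:
        print(f"theta={theta}, mu={mu} -> predicted H={hurst_exponent(theta, mu)}")
```

The two parameter sets give the predicted values 0.2 and 0.01, to be compared with the effective exponents 0.18 and 0.04 measured in the simulations.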

6. Discussion 

It has been proposed for decades that many large 
earthquakes were preceded by an unusually high seis- 
micity rate, for times of the order of weeks to months 
before the mainshock [Omori, 1908; Richter, 1958; 
Mogi, 1963]. Although there are large fluctuations in the foreshock patterns from one sequence to another, some recurrent properties are observed:






(i) The rate of foreshocks increases as $1/(t_c - t)^{p'}$ as a function of the time to the main shock at $t_c$, with an exponent p' smaller than or equal to the exponent p of the direct Omori law;

(ii) The Gutenberg-Richter distribution of magnitudes is modified as the mainshock approaches, and is usually modeled by a decrease in b-value;

(iii) The epicenters of the foreshocks seem to migrate towards the mainshock.

We must acknowledge that the robustness of these three laws decreases from (i) to (iii). In previous sections, we have shown that these properties of foreshocks derive simply from the two most robust empirical laws of earthquake occurrence, namely the Gutenberg-Richter and Omori laws, which define the ETAS model. In this ETAS framework, foreshock sequences emerge on average by conditioning seismicity to lead to a burst of seismicity at the time of the mainshock. This analysis differs from two other analytical studies of the ETAS model [Helmstetter and Sornette, 2002b; Sornette and Helmstetter, 2002], which proposed that accelerating foreshock sequences may be related either to the super-critical regime $n > 1$ or to the singular regime $a > \beta$ (leading formally to $n \to \infty$) of the ETAS model. In these two regimes, an accelerating seismicity sequence arises from the cascade of aftershocks that trigger on average more than one aftershock per earthquake. Here we show that foreshock sequences emerge in the stationary sub-critical regime ($n < 1$) of the ETAS model, when an event triggers on average fewer than one aftershock. In this regime, aftershocks have a low probability of triggering a larger earthquake. Nonetheless, conditioning on a high seismicity rate at the time of the mainshock, we observe, averaging over many mainshocks, an increase of the seismicity rate following the inverse Omori law. In addition, as we shall show below, this increase of seismicity has a genuine and significant predictive power.

6.1. Difference between type I and type II 
foreshocks 

Our results apply to foreshocks of type II, defined as earthquakes occurring in a space-time window preceding a mainshock, independently of their magnitude. This definition is different from the usual definition of foreshocks, which requires the mainshock to be larger than the foreshocks (foreshocks of type I in our terminology). Using the usual definition of foreshocks in our numerical simulations of the ETAS model, our results remain robust but there are quantitative differences introduced by the somewhat arbitrary constraint entering into the definition of foreshocks of type I:

1. a roll-off in the inverse Omori-law, 

2. a dependence of the apparent exponent p' on 
the time window used to define foreshocks and 
mainshocks and 

3. a dependence of the rate of foreshocks and of p' 
on the mainshock magnitude. 

As seen in Figure 5, these variations between fore- 
shocks of type I and type II are observed only for 
small mainshocks. Such foreshocks are less likely the 
foreshocks of a mainshock and are more likely to be 
preceded by a larger earthquake, that is, to be the 
aftershocks of a large preceding mainshock. These subtle distinctions should draw the reader's attention to the arbitrariness underlying the definition of foreshocks of type I and suggest, together with our results, that foreshocks of type II are more natural objects to define and study in real catalogs. This will be reported in a separate presentation.

6.2. Inverse Omori law 

Conditioned on the fact that a mainshock is associated with a burst of seismicity, the inverse Omori law arises from the expected fluctuations of the seismicity rate leading to this burst of seismicity. Depending on the branching ratio $n$ and on the ratio $a/\beta$, the exponent p' is found to vary between $1 - 2\theta$ and $1 + \theta$, but is always found to be smaller than the p exponent of the direct Omori law. Our results thus reproduce both the variability of p' and the lower value measured for p' than for p reported by [Papazachos, 1973, 1975b; Page, 1986; Kagan and Knopoff, 1978; Jones and Molnar, 1979; Davis and Frohlich, 1991; Shaw, 1993; Utsu et al., 1995; Ogata et al., 1995; Maeda, 1999]. In their synthesis of all p and p' values, Utsu et al. [1995] report p'-values in the range 0.7-1.3, while p of aftershocks ranges from 0.9 to 1.5. The few studies that have measured simultaneously p and p' using a superposed epoch analysis have obtained p' either roughly equal to p [Kagan and Knopoff, 1978; Shaw, 1993] or smaller than p [Davis and Frohlich, 1991; Ogata et al., 1995; Maeda, 1999]. The finding that $p \approx p' \approx 1$ suggested by [Shaw, 1993; Reasenberg, 1999] for the California seismicity can be interpreted in our framework as either due to a very small $\theta$ value, or due to a large $a/\beta$ ratio close to 0.8, as shown in Figures 4 and 6. The result p' < p reported by [Maeda, 1999] for the Japanese seismicity and by Davis and Frohlich [1991] for the worldwide seismicity can be related to a rather small $a/\beta$ ratio, as also illustrated in Figures 3 and 6.

In contrast with the direct Omori law, which is 
clearly observed after all large shallow earthquakes, 
the inverse Omori law is an average statistical law, 
which is observed only when stacking many foreshock 
sequences. Simulations reported in Figure 2 illustrate 
that, for individual foreshock sequences, the inverse 
Omori law is difficult to capture. Similarly to what 
was done for real data [Kagan and Knopoff, 1978; 
Jones and Molnar, 1979; Davis and Frohlich, 1991; 
Shaw, 1993; Ogata et al., 1995; Maeda, 1999; Reasenberg, 1999], the inverse Omori law emerges clearly
in our model only when using a superposed epoch 
analysis to average the seismicity rate over a large 
number of sequences. Our results are thus fundamen- 
tally different from the critical point theory [Sammis 
and Sornette, 2002] which leads to a power-law in- 
crease of seismic activity preceding each single large 
earthquake over what is probably a larger space-time 
domain [Keilis-Borok and Malinovskaya, 1964; Bow- 
man et al, 1998]. The inverse Omori law is indeed 
usually observed for time scales of the order of weeks 
to months before a mainshock, while Keilis-Borok and 
Malinovskaya [1964] and Bowman et al [1998] report a 
precursory increase of seismic activity for time scales 
of years to decades before large earthquakes. Our 
results can thus be considered as providing a null- 
hypothesis against which to test the critical point the- 
ory. 
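
The superposed epoch analysis referred to in this section can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used to produce the figures: the function name `stack_foreshock_rate`, the log-spaced bins and the synthetic uncorrelated catalog are all assumptions made for the example.

```python
import numpy as np


def stack_foreshock_rate(event_times, mainshock_times, t_min, t_max, n_bins=20):
    """Seismicity rate versus time t_c - t before the mainshock, averaged
    over all mainshocks (type II foreshocks: no magnitude constraint)."""
    event_times = np.sort(np.asarray(event_times, dtype=float))
    # Log-spaced bins in the time to the mainshock
    edges = np.logspace(np.log10(t_min), np.log10(t_max), n_bins + 1)
    counts = np.zeros(n_bins)
    for tc in mainshock_times:
        # Events falling in the window [tc - t_max, tc - t_min]
        lo = np.searchsorted(event_times, tc - t_max, side="left")
        hi = np.searchsorted(event_times, tc - t_min, side="right")
        counts += np.histogram(tc - event_times[lo:hi], bins=edges)[0]
    # Events per unit time and per mainshock in each bin
    rate = counts / (np.diff(edges) * len(mainshock_times))
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centers
    return centers, rate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    events = np.sort(rng.uniform(0.0, 1000.0, 100_000))  # no triggering
    mains = rng.choice(events[events > 10.0], size=200, replace=False)
    t, rate = stack_foreshock_rate(events, mains, t_min=0.01, t_max=10.0)
    good = rate > 0
    slope = np.polyfit(np.log10(t[good]), np.log10(rate[good]), 1)[0]
    print(f"apparent exponent p' = {-slope:.2f}")
```

For the uncorrelated synthetic catalog used here no acceleration is expected, so the fitted exponent should stay close to zero; applied to a catalog with triggering, the same log-log slope would give an estimate of p'.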

6.3. Foreshock occurrence rate 

In terms of occurrence rate, foreshocks are less frequent than aftershocks (e.g. [Kagan and Knopoff, 1976, 1978; Jones and Molnar, 1979]). The ratio of aftershock to foreshock numbers is close to 2-4 for M = 5-7 mainshocks, when selecting foreshocks and aftershocks at a distance R = 50-500 km from the mainshock and for a time T = 10-100 days before or after the mainshock [Kagan and Knopoff, 1976, 1978; Jones and Molnar, 1979; von Seggern et al., 1981; Shaw, 1993]. In our simulations, large mainshocks have significantly more aftershocks than foreshocks, in agreement with observations, while small earthquakes have roughly the same number of foreshocks (of type II) and of aftershocks. The ratio of aftershocks to foreshocks of type II increases if the ratio $a/\beta$ decreases, as observed when comparing the case $a = 0.5\beta$ shown in Figure 3 with the results obtained in the case $a = 0.8\beta$ represented in Figure 4. This may be explained by the relatively larger weights of the largest earthquakes which increase with increasing $a$, and by our definition of aftershocks and foreshocks: recall that aftershock sequences are conditioned on not being preceded by an event larger than the mainshock, whereas a foreshock of type II can be larger than the mainshock. Thus, for large $a/\beta < 1$, most "mainshocks", according to our definition, are aftershocks of a preceding large earthquake, whereas aftershock sequences cannot be preceded by an earthquake larger than the mainshock.

The retrospective foreshock frequency, that is, the 
fraction of mainshocks that are preceded by a fore- 
shock, is reported to range from 10% to 40% using 
either regional or worldwide catalogs [Jones and Mol- 
nar, 1979; von Seggern et al, 1981; Yamashina, 1981; 
Console et al., 1983; Jones, 1984; Agnew and Jones, 
1991; Lindh and Lim, 1995; Abercrombie and Mori, 
1996; Michael and Jones, 1998; Reasenberg, 1999]. 
The variability of the foreshock rate is closely related to the catalog threshold for the magnitude completeness for the small events [Reasenberg, 1999]. These results are in line with our simulations.

The observed number of foreshocks per mainshock slowly increases with the mainshock magnitude [e.g. data from Kagan and Knopoff, 1978; Shaw, 1993; Reasenberg, 1999]. In our model, the number of foreshocks of type II is independent of the mainshock magnitude, because the magnitude of each earthquake is independent of the previous seismicity history. An increase of the number of foreshocks of type I as a function of the mainshock magnitude is observed in our numerical simulations (see Figure 5) because, as we explained before, the constraint on the foreshock magnitudes to be smaller than the mainshock magnitude is less severe for larger earthquakes and thus filters out fewer foreshocks. Therefore, our results can explain the increase in the foreshock frequency with the mainshock magnitude reported using foreshocks of type I. The slow increase of the number of foreshocks with the mainshock magnitude, if any, is different from the predictions of both the nucleation model [Dodge et al., 1996] and of the critical point theory [Sammis and Sornette, 2002] which predict an increase of the foreshock rate and of the foreshock zone with the mainshock size.
6.4. Magnitude distribution of foreshocks

Many studies have found that the apparent b-value of the magnitude distribution of foreshocks is smaller than that of the magnitude distribution of the background seismicity and of aftershocks. Case histories analyze individual foreshock sequences, most of them being chosen a posteriori to suggest that foreshock patterns observed in acoustic emissions preceding rupture in the laboratory could apply to earthquakes [Mogi, 1963; 1967]. A few statistical tests validate the significance of reported anomalies on the b-value of foreshocks. A few other studies use a stacking method to average over many sequences in order to increase the number of events.

A b-value anomaly, usually a change in the mean b-value, for earthquakes preceding a mainshock has been proposed as a possible precursor in many retrospective case studies [Suyehiro, 1966; Papazachos et al., 1967; Ikegami, 1967; Berg, 1968; Bufe, 1970; Fedotov et al., 1972; Wyss and Lee, 1973; Papazachos, 1975a,b; Ma, 1978; Li et al., 1978; Wu et al., 1978; Cagnetti and Pasquale, 1979; Stephens et al., 1980; Smith, 1981, 1986; Imoto, 1991; Enescu and Kito, 2001]. Most case histories argue for a decrease of b-value, but this decrease, if any, is sometimes preceded by an increase of b-value [Ma, 1978; Smith, 1981, 1986; Imoto, 1991]. In a couple of cases, temporal decreases in b-value before Chinese earthquakes were used to issue successful predictions [Wu et al., 1978; Zhang et al., 1999].

Because of the paucity of foreshocks, most studies of individual sequences do not allow one to estimate a robust temporal change of b-values before mainshocks, nor to characterize the shape of the magnitude distribution. A few studies have demonstrated the statistical significance of decreases of b-value when the time to the mainshock decreases using a superposed epoch analysis [Kagan and Knopoff, 1978; Molchan and Dmitrieva, 1990; Molchan et al., 1999]. Using 200 foreshock sequences of regional and worldwide seismicity, Molchan et al. [1999] found that the b-value is divided by a factor approximately equal to 2 a few days or hours before the mainshock. Knopoff et al. [1982] found no significant differences between the b-value of aftershocks and foreshocks when investigating 12 individual sequences of California catalogs. When all the aftershocks and foreshocks in a given catalog are superposed, the same study showed for catalogs of large durations (e.g. ISC, 1964-1977; NOAA, 1965-1977) that the b-value for foreshocks is significantly smaller than the b-value for aftershocks [Knopoff et al., 1982]. The same pattern being simulated by a branching model for seismicity, Knopoff et al. [1982] surmise that the observed and simulated changes in the magnitude distribution arise intrinsically from the conditioning of aftershocks and foreshocks and from the smaller numbers of foreshocks relative to aftershocks when counted from the mainshock time. The result of [Knopoff et al., 1982] is often cited as disproving the reality of a change of b-value. Our results find that a change in b-value in the ETAS branching model of seismicity is a physical phenomenon with real precursory content. This shall be stressed further below in association with Figure 10. Therefore, the fact that a change in b-value can be reproduced by a branching model of seismicity cannot discredit the strong empirical evidence of a change of b-value [Knopoff et al., 1982] and its genuine physical content capturing the interactions between and triggering of earthquakes.

The observed modification of the magnitude distribution of foreshocks is usually interpreted as a decrease of b-value as the mainshock approaches. However, some studies argue that the Gutenberg-Richter distribution before a mainshock is no longer a pure power-law distribution, due to an apparent increase of the number of large events relative to the Gutenberg-Richter law, while the rate of small earthquakes remains constant. Such a pattern is suggested by Rotwain et al. [1997] for both acoustic emission preceding material failure, and possibly for Californian seismicity preceding large earthquakes. Analysis of seismicity before recent large shocks also argues for an increase in the rate of moderate and large earthquakes before a mainshock [Jaume and Sykes, 1999]. Knopoff et al. [1982] also suspected a deviation from a linear Gutenberg-Richter distribution for foreshocks. Our study of the ETAS model confirms that such a modification of the magnitude distribution before a mainshock must be expected when averaging over many foreshock sequences.

Intuitively, the modification of the magnitude distribution arises in our model from the increase of the aftershock rate with the mainshock magnitude. Any event has thus a higher probability to occur just after a large event, because this large event induces an increase of the seismicity rate. The novel property that we demonstrate is that, before a mainshock, the energy distribution is no longer a pure power-law, but is the sum of the unconditional distribution with exponent $\beta$ and an additional deviatoric power-law distribution with a smaller exponent $\beta' = \beta - a$ as seen from expression (51). In addition, we predict and verify numerically in Figures 7 and 8 that the amplitude of the deviatoric term increases as a power-law of the time to the mainshock. A similar behavior has been proposed as a precursory pattern termed "pattern upward bend" [Keilis-Borok et al., 2001] or alternatively providing "pattern $\gamma$" measured as the difference between the slope of the Gutenberg-Richter law for low and for large magnitudes. According to our results, pattern $\gamma$ should increase from 0 to the value $a$.

According to the ETAS model, the modification of the magnitude distribution is independent of the mainshock magnitude, as observed by [Kagan and Knopoff, 1978; Knopoff et al., 1982; Molchan and Dmitrieva, 1990; Molchan et al., 1999]. Therefore, all earthquakes, whatever their magnitude, are preceded on average by an increase of the rate of large events. Although the foreshock magnitude distribution is, strictly speaking, no longer a pure power-law but rather the sum of two power laws, a single power-law distribution with a decreasing b-value as the time of the mainshock is approached is a simple and robust way to quantify the increasing importance of the tail of the distribution, especially for the short foreshock sequences usually available. This rationalizes the suggestion found in many works that a decrease in b-value is a (retrospective) signature of an impending mainshock. The novel insight provided by our analysis of the ETAS model is that a better characterization of the magnitude distribution before mainshocks may be provided by the sum of two power law distributions expressed by equation (51) and tested in synthetic catalogs in Figures 7 and 8. This rationalizes both the observed relatively small b-values reported for foreshocks and the apparent decrease of b-value when the mainshock approaches. Similarly to the inverse Omori law, the modification of the magnitude distribution prior to the mainshock is a statistical property which yields an unambiguous signal only when stacking many foreshock sequences. This may explain the variability of the patterns of b-value observed for individual foreshock sequences.
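
To illustrate why a single power-law fit reports a decreasing exponent when the tail term grows, the Python sketch below draws energies from a mixture of two Pareto laws with exponents $\beta$ and $\beta' = \beta - a$ and applies the Hill maximum likelihood estimator to the whole sample. The mixture weight `w` merely stands in for the time-dependent amplitude of the deviatoric term; its values, the function names and the sample size are illustrative assumptions, not quantities taken from the simulations of this paper.

```python
import numpy as np


def sample_two_power_laws(n, beta, a, w, e0=1.0, rng=None):
    """Draw n energies >= e0 from (1 - w) * Pareto(beta) + w * Pareto(beta - a),
    where Pareto(k) has pdf proportional to 1 / E**(1 + k)."""
    rng = np.random.default_rng() if rng is None else rng
    if not (0.0 < a < beta) or not (0.0 <= w <= 1.0):
        raise ValueError("this sketch assumes 0 < a < beta and 0 <= w <= 1")
    # Pick the exponent of each event: beta - a with probability w
    exponents = np.where(rng.random(n) < w, beta - a, beta)
    # Inverse-transform sampling of P(E > x) = (e0 / x) ** exponent
    u = 1.0 - rng.random(n)  # uniform in (0, 1]
    return e0 * u ** (-1.0 / exponents)


def hill_exponent(energies, e0=1.0):
    """Hill maximum likelihood estimate of the exponent of a pure power law."""
    energies = np.asarray(energies, dtype=float)
    return 1.0 / np.mean(np.log(energies / e0))


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    beta, a = 2.0 / 3.0, 1.0 / 3.0  # b = 1 and a = 0.5 * beta
    for w in (0.0, 0.1, 0.3):
        energies = sample_two_power_laws(200_000, beta, a, w, rng=rng)
        print(f"w = {w:.1f}: apparent exponent = {hill_exponent(energies):.3f}")
```

The apparent exponent decreases from $\beta$ towards $\beta'$ as the weight of the second power law increases, which is the sense in which a decreasing b-value summarizes the growing tail.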

A modification of the magnitude distribution before large earthquakes is also expected from the critical point theory [Sammis and Sornette, 2002]. The energy distribution far from a critical point is characterized by a power-law distribution with an exponential roll-off. As the seismicity evolves towards the critical point, the truncation of the energy distribution increases. At the critical point, the average energy becomes infinite (in an infinite system) and the energy distribution follows a pure power-law distribution. This modification of the seismicity predicted by the critical point theory is different from the one reported in this study, but the two models yield an apparent decrease of b-value as the mainshock approaches. Therefore, it is difficult to distinguish the two models in real seismicity data. However, the difference between the two models is that a modification of the energy distribution should only be observed before major earthquakes according to the critical point theory. Of course, one cannot exclude that both mechanisms occur and are mixed up in reality.

6.5. Implications for earthquake prediction 

The inverse Omori law and the apparent decrease of b-value have been derived in this study as statistical laws describing the average fluctuations of seismicity conditioned on leading to a burst of seismicity at the time of the mainshock. This does not mean that there is not a genuine physical content in these laws. We now demonstrate that they may actually embody an important part of the physics of earthquakes and describe the process of interactions between and triggering of earthquakes by other earthquakes. For this purpose, we use the modification of the magnitude frequency and the increase of the seismicity rate as tools for predicting individual future mainshocks. In the present work, we restrict our tests to the ETAS branching model used as a testing ground for our ideas.

Using numerical simulations of the ETAS model generated with $b = 1$, $a = 0.5\beta$, $n = 1$, $m_0 = 3$ and $\theta = 0.2$, we find that large earthquakes occur more frequently following a small locally estimated b-value. We have measured the b-value using a maximum likelihood method for a sliding window of 100 events. For instance, we find that 29% of the large M > 6 mainshocks occur in an 11% time period where $\beta$ is less than 95% of its actual value (that is, b < 0.95). This leads to a significant prediction gain of g = 2.7, defined as the ratio of the fraction of successful predictions (29%) over the fraction of time covered by the alarms (11%) [Aki, 1981]. A random prediction would lead to g = 1.
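
The alarm-based test described above can be sketched as follows. The estimator is the standard maximum likelihood formula $b = \log_{10}(e)/(\langle M \rangle - M_{min})$ for continuous magnitudes above a threshold; the function names, the counting of alarms per event rather than per unit time, and the synthetic catalog of independent magnitudes are simplifying assumptions of this illustration.

```python
import numpy as np


def sliding_b_value(mags, m_min, window=100):
    """b-value estimated from the `window` events preceding each event.

    Entry k uses mags[k - window:k]; the first `window` entries are NaN.
    """
    mags = np.asarray(mags, dtype=float)
    b = np.full(mags.size, np.nan)
    # Cumulative sums give the mean excess magnitude of each window
    csum = np.concatenate(([0.0], np.cumsum(mags - m_min)))
    mean_excess = (csum[window:-1] - csum[:-window - 1]) / window
    b[window:] = np.log10(np.e) / mean_excess
    return b


def prediction_gain(alarm, target):
    """Gain g = (fraction of targets inside alarms) / (fraction of alarms)."""
    alarm = np.asarray(alarm, dtype=bool)
    target = np.asarray(target, dtype=bool)
    return alarm[target].mean() / alarm.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    m_min, b_true = 3.0, 1.0
    # Independent Gutenberg-Richter magnitudes: no triggering, no memory
    mags = m_min + rng.exponential(np.log10(np.e) / b_true, 200_000)
    b = sliding_b_value(mags, m_min, window=100)
    g = prediction_gain(b < 0.95, mags >= 6.0)
    print(f"gain for a memoryless catalog: g = {g:.2f}")
```

Because the magnitudes of this synthetic catalog are independent of the past, the gain is expected to stay close to the random value g = 1; a gain well above 1, as reported for the ETAS simulations, requires the triggering between events.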

A much larger gain can be obtained using other precursory indicators related to the inverse Omori law. First, a large earthquake is likely to occur following another large earthquake. For the same simulation, fixing an alarm if the largest event within the 100 preceding earthquakes is larger than M = 6 yields a probability gain g = 10 for the prediction of a mainshock of magnitude equal to or larger than M = 6. Second, a large seismicity rate observed at a given "present" time will lead on average to a large seismicity rate in the future, and thus it increases the probability of having a large earthquake. Measuring the seismicity rate over a sliding window with flexible length imposed to contain exactly 100 events and fixing the alarm threshold at 0.05 events per day, we are able to predict 20% of the M > 6 events with just 0.16% of the time period covered by the alarms. This gives a prediction gain g = 129.

Figure 10 synthesizes and extends these results by showing the so-called error diagram [Molchan, 1991; 1997] for each of three functions measured in a sliding window of 100 events: (i) the maximum magnitude $M_{max}$ of the 100 events in that window, (ii) the apparent Gutenberg-Richter exponent $\beta$ measured on these 100 events by the standard Hill maximum likelihood estimator and (iii) the seismicity rate $r$ defined as the inverse of the duration of the window containing 100 events. For each function, an alarm is declared for the next event when the function is either larger (for $M_{max}$ and $r$) or smaller (for $\beta$) than a threshold. Scanning all possible thresholds constructs the continuous curves shown in the figure. The results on the prediction obtained by using these three precursory functions are considerably better than those obtained for a random prediction, shown as a dashed line for reference. We have not tried at all to optimize any facet of these prediction tests, which are offered for the sole purpose of stressing the physical reality of the precursory information contained in the foreshocks.
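
The construction of an error diagram by scanning all thresholds can be sketched as below. For simplicity the alarm fraction is counted in events rather than in time and ties are broken by sort order; both choices, as well as the function name `error_diagram`, are assumptions of this illustration rather than the procedure used for Figure 10.

```python
import numpy as np


def error_diagram(score, target):
    """Error diagram of a precursory function, scanning all thresholds.

    score  : value of the precursory function before each event, oriented
             so that larger means more alarming (pass -beta for the exponent).
    target : boolean array, True where the event is a target mainshock.
    Returns (alarm_fraction, miss_fraction), one point per threshold.
    """
    score = np.asarray(score, dtype=float)
    target = np.asarray(target, dtype=bool)
    order = np.argsort(-score, kind="stable")  # most alarming events first
    hits = np.concatenate(([0], np.cumsum(target[order])))
    alarm_fraction = np.arange(score.size + 1) / score.size
    miss_fraction = 1.0 - hits / target.sum()
    return alarm_fraction, miss_fraction


if __name__ == "__main__":
    # Tiny hand-checkable example: 4 events, 2 of them targets
    alarm, miss = error_diagram([0.9, 0.1, 0.8, 0.2], [True, False, False, True])
    for x, y in zip(alarm, miss):
        print(f"alarm fraction {x:.2f} -> fraction of failures to predict {y:.2f}")
```

A random prediction follows the diagonal from (0, 1) to (1, 0); at any point of the curve the prediction gain is (1 - miss fraction) / (alarm fraction).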

6.6. Migration of foreshocks 

Among the proposed patterns of foreshocks, the migration of foreshocks towards the mainshock is much more difficult to observe than either the inverse Omori law or the change in b-value. This is due to the limited number of foreshocks and to the location errors. Similarly to other foreshock patterns, a few case-histories have shown seismicity migration before a mainshock. When reviewing 9 M > 7 shallow earthquakes in China, Ma et al. [1990] report a migration of M > 3-4 earthquakes towards the mainshock over a few years before the mainshock and at a distance of a few hundreds of kilometers. Fewer than 20 events are used for each case study. While the case for the diffusion of aftershocks is relatively strong [Kagan and Knopoff, 1976, 1978; von Seggern et al., 1981; Tajima and Kanamori, 1985] but still controversial, the migration of foreshocks towards the mainshock area, suggested using a stacking method [e.g., Kagan and Knopoff, 1976, 1978; von Seggern et al., 1981; Reasenberg, 1985] is even less clearly observed.

Using the ETAS model, Helmstetter and Sornette [2002b] have shown that the cluster of aftershocks diffuses on average from the mainshock according to the diffusion law $R \sim t^H$, where $R$ is the typical size of the cluster and $H$ is the so-called Hurst exponent which can be smaller or larger than 1/2. In the present study, we have shown analytically and numerically that this diffusion of aftershocks must be reflected into a (reverse) migration of seismicity towards the mainshock, with the same diffusion exponent $H$ (defined in (55)). We should however point out that this predicted migration of foreshocks, as well as the diffusion of aftershocks, is significant only over a finite domain of the parameter space over which the ETAS model is defined. Specifically, a significant spatio-temporal coupling of the seismicity leading to diffusion and migration is expected and observed in our simulations only for sufficiently large $\theta$'s and for short times $|t_c - t| < t^*$ from the mainshock, associated with a direct Omori exponent p smaller than 1. This may explain why the diffusion of aftershocks and the migration of foreshocks are often difficult to observe in real data.

An additional difficulty in real data arises from the background seismicity, which can induce a spurious diffusion of aftershocks or migration of foreshocks (see Figure 9c). As for the other foreshock patterns derived in this study, the migration of foreshocks towards the mainshock and the spatial distribution of foreshocks are independent of the mainshock magnitude. These results disagree with the observations of [Keilis-Borok and Malinovskaya, 1964; Bowman et al., 1998] who suggest that the area of accelerating seismicity prior to a mainshock increases with the mainshock size. An increase of the foreshock zone with the mainshock size may however be observed in the ETAS model when using foreshocks of type I (conditioned on being smaller than the mainshock) and introducing a characteristic size of the aftershock zone $d$ in (54) increasing with the mainshock size.

7. Conclusion 

We have shown that the ETAS (epidemic-type aftershock) branching model of seismicity, based on the two best established empirical Omori and Gutenberg-Richter laws, contains essentially all the phenomenology of foreshocks. Using this model, decades of empirical studies on foreshocks are rationalized, including the inverse Omori law, the b-value change and seismicity migration. For each case, we have derived analytical solutions that relate the foreshock distributions in the time, space and energy domain to the properties of a simple earthquake triggering process embodied by aftershocks. We find that all previously reported properties of foreshocks arise from the Omori and Gutenberg-Richter laws when conditioning the spontaneous fluctuations of the rate of seismicity to end with a burst of activity, which defines the time of the mainshock. The foreshock laws are seen as statistical laws which are clearly observable when averaging over a large number of sequences and should not be observed systematically when looking at individual foreshock sequences. Nevertheless, we have found that foreshocks contain genuine important physical information on the triggering process and may be used successfully to predict earthquakes with very significant probability gains. Taken together, these results suggest that the physics of aftershocks is sufficient to explain the properties of foreshocks, and that there is no essential physical difference between foreshocks, aftershocks and mainshocks.
Appendix A: Deviations from the average seismicity rate

Using the definition (8) of $\lambda(t)$, in the case where the external source term $s(t)$ is a Dirac $\delta(t)$, we obtain the following expression for the stochastic propagator

$\kappa(t) = \delta(t) +$
(A1)

We now express the deviation of $\kappa(t)$ from its ensemble average $K(t)$. This can be done by using (12), which means that the distribution density of earthquake energies is constructed by recording all earthquakes and by counting the frequency of their energies. Thus, $\delta(E - E_i)$ can be seen as the sum of its average plus a fluctuation part, namely, it can be formally expressed as $\delta(E - E_i) = P(E) + \delta P(E)$, where $\delta P(E)$ denotes the fluctuation of $\delta(E - E_i)$ around its ensemble average $P(E)$. Similarly, $\kappa(t) = \sum_{t_i<t} \delta(t - t_i) = K(t) + \delta\kappa(t)$, where $\delta\kappa(t)$ is the fluctuating part of the seismic rate around its ensemble average $K(t)$.

We can thus express the sum of products of Dirac functions in (A1) as follows:

$$\sum_{i | t_i<t} \delta(E - E_i) \, \delta(t - t_i) = P(E) K(t) + \delta(P\kappa)(E,t) . \qquad (A2)$$

As a first illustration, we can use the approximation that the fluctuations of the product $\delta(E - E_i) \sum_{t_i<t} \delta(t - t_i)$ can be factorized to write

$$\delta(E - E_i) \sum_{t_i<t} \delta(t - t_i) = (P(E) + \delta P(E)) \, (K(t) + \delta\kappa(t)) \approx P(E) K(t) + P(E) \, \delta\kappa(t) + K(t) \, \delta P(E) . \qquad (A3)$$

Using expression (A1) for $\kappa(t)$ and expression (15) for $K(t)$, and putting (A3) in (A1), we then obtain

$$\kappa(t) = K(t) + \int_{0}^{t} d\tau \int_{E_0}^{+\infty} dE \; \phi_E(t-\tau) \, \delta(P\kappa)(E,\tau) , \qquad (A4)$$

where

$$\delta(P\kappa)(E,t) = \delta P(E) \, K(t) + P(E) \, \delta\kappa(t) . \qquad (A5)$$

By construction, the average of the double integral in
the r.h.s. of (A4) is zero. The double integral thus represents
the fluctuating part of the realization-specific seismic
response $\kappa(t)$ to a triggering event. Inserting (A4) in (22),
we obtain

$$\lambda(t) = N(t) + \int_{-\infty}^{t} d\tau\; s(\tau) \int_{0}^{t-\tau} du \int_{E_0}^{+\infty} dE\; \phi_E(t - \tau - u)\, \delta(P\kappa)(E,u) . \qquad \text{(A6)}$$

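Using $\int_{-\infty}^{t} d\tau \int_{0}^{t-\tau} du = \int_{0}^{+\infty} du \int_{-\infty}^{t-u} d\tau$, expression (A6)
reads

$$\lambda(t) = N(t) + \int_{E_0}^{+\infty} dE \int_{0}^{+\infty} du\; \delta(P\kappa)(E,u) \int_{-\infty}^{t-u} d\tau\; s(\tau)\, \phi_E(t - \tau - u) . \qquad \text{(A7)}$$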

For instance, let us consider the first contribution
$\delta P(E) K(t)$ of $\delta(P\kappa)(E,t)$ given by (A5). Denoting

$$\epsilon = \int_{E_0}^{+\infty} dE\; \rho(E)\, \delta P(E) ,$$

$\lambda(t)$ given by (A7) is of the form (23) with

$$\eta(\tau) = \epsilon \int_{0}^{+\infty} dx\; s(\tau - x)\, \psi(x) ,$$

where $\psi(x)$ is the bare Omori propagator defined in (4).

The only property needed below is that the stochastic
process $\eta(\tau)$ be stationary. This is the case because
$\delta P(E)$ and the source $s(t)$ are stationary
processes. Similarly, the second contribution $P(E)\, \delta\kappa(t)$
of $\delta(P\kappa)(E,t)$ given by (A5) takes the form (23) if $\delta\kappa(t)$ is
a noise proportional to $K(t)$. At present, we cannot prove
this, but it seems a natural assumption. More generally,
one could avoid the decomposition of $\delta(P\kappa)(E,t)$ given
by (A5) and get the same result as long as $\delta(P\kappa)(E,t)$ is
equal to a stationary noise multiplying $K(t)$.

Appendix B: Conditioning weighted 
power law variables on the realization 
of their sum 

Consider i.i.d. (independent and identically distributed)
random variables $x_i$ distributed according to a power law
$p(x_i)$ with exponent $m < 2$. Let us define the sum

$$S_N = \sum_{j=1}^{N} K_j x_j , \qquad \text{(B1)}$$

where the $K_i$'s are arbitrary positive weights. Here, we
derive that the expectation $\mathrm{E}[x_i|S_N]$ of $x_i$ conditioned on
the existence of a large realization of $S_N$ is given by (39).
By definition, $\mathrm{E}[x_i|S_N] = N/D$ where

$$N = \int dx_1 \dots \int dx_N\; x_i\, p(x_1) \dots p(x_N)\; \delta\Big(S_N - \sum_{j=1}^{N} K_j x_j\Big) \qquad \text{(B2)}$$

and $D$ is the same expression without the factor $x_i$. The
Fourier transform of (B2) with respect to $S_N$ yields

$$\hat{N}(k) = \dots$$

We have used the identity
$\int dx_i\; x_i\, p(x_i)\, e^{i k K_i x_i} = \frac{1}{ik} \frac{d\hat{p}(k K_i)}{dK_i}$,
and $\hat{p}(k)$ is the Fourier transform of $p(x)$. Note
that $\prod_{j=1}^{N} \hat{p}(k K_j)$ is nothing but the Fourier transform
$\hat{P}_S(k)$ of the distribution $P_N(S_N)$. Using the elementary
identities of derivatives of Fourier transforms and by taking
the inverse Fourier transform, we thus get

By definition, the denominator $D$ is identically equal to
$P_N(S_N)$. This yields the general result

In the special case where all $K_i$'s are equal, this gives the
"democratic" result $\mathrm{E}[x_i|S_N] = S_N/N$.

For power law variables with distribution $p(x) \sim 1/x^{1+m}$
with $m < 2$, we can use the generalized central limit theorem
to obtain that $P_N(S_N)$ converges for large $N$ to a
stable Lévy law $L_m$ with index equal to the exponent $m$
and scale factor $\sum_{j=1}^{N} K_j^m$ [Gnedenko and Kolmogorov,
1954; Sornette, 2000]:

The only dependence of $P_N(S_N)$ on $K_i$ is found in the
scale factor. Putting the expression (B6) into (B5) yields
the announced result (39). In particular, for $m = 2$, this
recovers the standard result for Gaussian variables that
$\mathrm{E}[x_i|S_N] \sim S_N K_i$, because the stable Lévy law of index
$m = 2$ is the Gaussian distribution.
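
The Gaussian limit quoted in this appendix, $\mathrm{E}[x_i|S_N] \sim S_N K_i$, can be checked numerically. The sketch below is an illustration only, not part of the original derivation: it draws standard Gaussian variables, forms the weighted sum, and estimates the slope of $x_i$ against $S_N$, which for jointly Gaussian variables equals $K_i / \sum_j K_j^2$. The function name `conditional_slopes`, the sample size, and the example weights are assumptions made for this illustration.

```python
import numpy as np

def conditional_slopes(weights, n_samples=200_000, seed=0):
    """Estimate the slope s_i in E[x_i | S] = s_i * S for Gaussian x_j.

    Hypothetical helper (not from the paper): x_j are i.i.d. standard
    normal variables and S = sum_j K_j x_j.
    """
    rng = np.random.default_rng(seed)
    K = np.asarray(weights, dtype=float)
    x = rng.standard_normal((n_samples, K.size))
    S = x @ K  # weighted sum S_N for each sample
    # Least-squares slope through the origin of x_i against S.
    # For jointly Gaussian variables the conditional mean is exactly linear.
    return (x * S[:, None]).mean(axis=0) / (S ** 2).mean()

K = np.array([1.0, 2.0, 0.5])
slopes = conditional_slopes(K)
expected = K / np.sum(K ** 2)  # exact Gaussian result, proportional to K_i
print(slopes, expected)
```

With equal weights $K_j = 1$ the same estimate gives a slope of $1/N$ for every variable, which is the "democratic" result.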
Figure 1. An example of a realization of the ETAS model, which illustrates the differences between the observed
seismicity rate $\kappa(t)$ (noisy solid line), the average renormalized (or dressed) propagator $K(t)$ (solid line), and the
local propagator $\phi_E(t)$ (dashed line). The magnitudes of the earthquakes are shown in panel (b). This aftershock
sequence has been generated using the ETAS model with parameters $n = 1$, $a = 0.8\beta$, $\theta = 0.2$, $m_0 = 2$ and
$c = 0.001$ day, starting from a mainshock of magnitude $M = 7$ at time $t = 0$. The global aftershock rate $\kappa(t)$ is
significantly higher than the direct (or first-generation) aftershock rate, described by the local propagator $\phi_E(t)$.
The global aftershock rate $\kappa(t)$ decreases on average according to the dressed propagator $K(t) \sim 1/t^{1-\theta}$, which is
significantly slower than the local propagator $\phi(t) \sim 1/t^{1+\theta}$. The best fit to the observed seismicity rate $\kappa(t)$ is
indistinguishable from the average dressed propagator $K(t)$. Large fluctuations of the seismicity rate correspond
to the occurrence of large aftershocks, which trigger their own aftershock sequences. Third-generation aftershocks
can be easily observed.
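
A cascade of the kind shown in Figure 1 can be sketched as a branching process in which every event triggers its own direct aftershocks. The code below is a minimal illustration under stated assumptions, not the simulation code used for the figure: it works with magnitudes rather than energies (exponents `b` and `alpha`), uses a bare Omori kernel proportional to $1/(t+c)^{1+\theta}$, truncates magnitudes at a hypothetical `m_max`, and defaults to a sub-critical branching ratio `n = 0.5` so that the cascade stays small. The function name and all default values are assumptions.

```python
import numpy as np

def simulate_etas_cascade(m_main=5.0, n=0.5, b=1.0, alpha=0.5, theta=0.2,
                          c=0.001, m0=2.0, m_max=7.0, t_max=100.0,
                          max_events=200_000, seed=0):
    """Branching-process sketch of an aftershock cascade (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Productivity constant: with an untruncated Gutenberg-Richter law the
    # mean number of direct aftershocks per event equals n (requires alpha < b).
    k = n * (b - alpha) / b
    span = 1.0 - 10.0 ** (-b * (m_max - m0))  # GR mass below m_max
    times, mags = [0.0], [m_main]
    queue = [(0.0, m_main)]  # events whose direct aftershocks are still to be drawn
    while queue and len(times) < max_events:
        t_parent, m_parent = queue.pop()
        n_direct = rng.poisson(k * 10.0 ** (alpha * (m_parent - m0)))
        if n_direct == 0:
            continue
        # Inverse-transform sampling of waiting times from the bare Omori
        # kernel, whose CDF is 1 - (c / (t + c))**theta.
        u = 1.0 - rng.random(n_direct)  # uniform on (0, 1]
        t_child = t_parent + c * (u ** (-1.0 / theta) - 1.0)
        t_child = t_child[t_child <= t_max]  # keep events inside the window
        # Magnitudes from a Gutenberg-Richter law truncated to [m0, m_max).
        v = rng.random(t_child.size)
        m_child = m0 - np.log10(1.0 - v * span) / b
        times.extend(t_child)
        mags.extend(m_child)
        queue.extend(zip(t_child, m_child))
    order = np.argsort(times)
    return np.asarray(times)[order], np.asarray(mags)[order]

times, mags = simulate_etas_cascade()
print(len(times), "events, largest magnitude", mags.max())
```

Binning `times` logarithmically gives an estimate of the seismicity rate after the mainshock. Moving `n` toward 1 approaches the critical regime of the caption but makes the number of events grow quickly, which is why the sketch caps the catalog at `max_events`.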
Figure 2. Typical foreshock (a) and aftershock (b) sequences generated by the ETAS model, for mainshocks of
magnitude $M = 5.5$ occurring at time $t = 0$. We show 11 individual sequences in each panel. The solid black line
represents the mean seismicity rate before and after a mainshock of magnitude $M = 5.5$, estimated by averaging
over 250 sequences. The synthetic catalogs have been generated using the parameters $n = 1$, $\theta = 0.2$, and $a = 0.5\beta$,
with a minimum magnitude threshold $m_0 = 2$. In contrast with the direct Omori law, which is clearly observed after
any large mainshock, there are large fluctuations from one foreshock sequence to another, and the inverse Omori
law (with accelerating seismicity) is only observed when averaging over a large number of foreshock sequences.
Figure 3. Direct and inverse Omori laws for a numerical simulation with $a = 0.5\beta$ and $\theta = 0.2$, showing the two
exponents $p = 1 - \theta$ for aftershocks and $p' = 1 - 2\theta$ for foreshocks of type II. The rate of aftershocks (crosses) and
foreshocks (circles) per mainshock, averaged over a large number of sequences, is shown as a function of the time
$|t_c - t|$ to the mainshock, for different values of the mainshock magnitude between 1.5 and 5, with a step of 0.5.
The symbol size increases with the mainshock magnitude. The truncation of the seismicity rate for small times
$|t_c - t| \sim 0.001$ day is due to the characteristic time $c = 0.001$ day in the bare Omori propagator $\psi(t)$, and is the
same for foreshocks and aftershocks. The number of aftershocks increases with the mainshock energy as $N \sim E^a$,
whereas the number of foreshocks of type II is independent of the mainshock energy.
Figure 4. Same as Figure 3 for $a = 0.8\beta$, showing the larger ratio of foreshocks to aftershocks compared
to the case $a = 0.5\beta$.
Figure 5. Foreshock seismicity rate per mainshock for foreshocks of type II (circles) and foreshocks of type I
(crosses), for a numerical simulation with $n = 1$, $c = 0.001$ day, $\theta = 0.2$, $a = 0.5\beta$ and $m_0 = 2$. For foreshocks
of type I, we have considered mainshock magnitudes $M$ ranging from 3 to 6. We have rejected from the analysis
of foreshocks of type I all mainshocks which have been preceded by a larger event in a time interval extending
up to $t = 1000$ days preceding the mainshock. The rate of foreshocks of type II is independent of the mainshock
magnitude $M$, while the rate of foreshocks of type I increases with $M$. For large mainshock magnitudes, the rate of
foreshocks of type I is very close to that of foreshocks of type II. The conditioning that foreshocks of type I must be
smaller than their mainshock induces an apparent increase of the Omori exponent $p'$ as the mainshock magnitude
decreases. It also induces an upward bending of the seismicity rate at times $t \approx 1000$ days, especially for the small
magnitudes.
Figure 6. Exponents $p'$ and $p$ of the inverse and direct Omori laws obtained from numerical simulations of the
ETAS model. The estimated values of $p'$ (circles) for foreshocks and $p$ (crosses) for aftershocks are shown as a
function of $\theta$ in the case $a = 0.5$ (a), and as a function of $a/\beta$ in the case $\theta = 0.2$ (b). For $a/\beta$ not too large,
the values of $p'$ for foreshocks are in good agreement with the predictions $p' = 1 - 2\theta$ for $a/\beta < 0.5$ (34) and
$p' = 1 - \beta\theta/a$ for $a/\beta > 0.5$ (43). The theoretical values of $p'$ are represented with dashed lines in each plot, and
the theoretical prediction for $p$ is shown as solid lines. For $a/\beta$ not too large, the measured exponent for aftershocks
is in good agreement with the prediction $p = 1 - \theta$ (16). For $a/\beta > 0.5$, both $p$ and $p'$ are larger than the
predictions (16) and (43). For $a/\beta$ close to 1, both $p$ and $p'$ are found close to the exponent $1 + \theta = 1.2$ of the bare
propagator $\psi(t)$. See text for an explanation.
Figure 7. Magnitude distribution of foreshocks for two time periods: $t_c - t < 0.1$ day (crosses) and $1 < t_c - t < 10$
days (circles), for a numerical simulation of the ETAS model with parameters $\theta = 0.2$, $\beta = 2/3$, $c = 10^{-3}$ day,
$m_0 = 2$ and $a = \beta/2 = 1/3$. The magnitude distribution $P(m)$ shown on the first plot (a) has been built by stacking
many foreshock sequences of $M > 4.5$ mainshocks. The observed magnitude distribution is in very
good agreement with the prediction (51), shown as a solid line for each time period, that the magnitude distribution
is the sum of the unconditional Gutenberg-Richter law with an exponent $b = 1.5\beta = 1$, shown as a dashed black
line, and a deviatoric Gutenberg-Richter law $dP(m)$ with an exponent $b' = b - \alpha = 0.5$ with $\alpha = 1.5a = 0.5$.
The amplitude of the perturbation increases as $t_c - t$ decreases, as expected from (51). The observed deviatoric
magnitude distribution $dP(m)$ is shown on plot (b) for the same time periods, and is in very good agreement with
the prediction shown as a dashed black line. We must stress that the energy distribution is no longer a pure power
law close to the mainshock, but the sum of two power laws. The panel on the right exhibits the second power law,
which is created by the conditioning mechanism underlying the appearance of foreshocks. See text.
Figure 8. Same as Figure 7 but for $a = 0.8\beta$. In this case, the deviatoric Gutenberg-Richter contribution
is observed only for the largest magnitudes, for which the statistics are the poorest, hence the relatively large
fluctuations around the exact theoretical predictions.
Figure 9. Migration of foreshocks, for superposed foreshock sequences generated with the ETAS model for two
choices of parameters, (a) $n = 1$, $\theta = 0.2$, $a = 0.5\beta$, $\mu = 1$, $d = 10$ km, $c = 0.001$ day, $m_0 = 2$ and (b) $n = 1$,
$\theta = 0.02$, $a = 0.5\beta$, $\mu = 3$, $d = 1$ km, $c = 0.001$ day, $m_0 = 2$. The distribution of foreshock-mainshock distances is
shown in panels (a) and (b) for the two simulations, for different time periods ranging from $10^{-4}$ to $10^{4}$ days.
The distribution of mainshock-aftershock distances given by (54) describing direct lineage is shown as a dashed
line for reference. In panel (a), we see clearly a migration of the seismicity towards the mainshock, as expected
from the significant diffusion exponent $H = 0.2$ predicted by (55). In contrast, the distribution of the
foreshock-mainshock distances shown in panel (b) is independent of the time from the mainshock, as expected from the much
smaller diffusion exponent $H = 0.01$ predicted by (55). The characteristic size of the foreshock cluster is shown
as a function of the time to the mainshock in panel (c) for the two numerical simulations. Circles correspond to
the simulation shown in panel (a) and crosses correspond to the simulation shown in panel (b). The solid line is
a fit of the characteristic size of the foreshock cluster by $R \sim t^H$. For the simulation generated with $\theta = 0.2$ and
$\mu = 1$ (circles), we obtain $H = 0.18 \pm 0.02$, in very good agreement with the prediction $H = \theta/\mu = 0.2$ (55). The
simulation generated with $\theta = 0.02$ and $\mu = 3$ (crosses) has a much smaller exponent $H = 0.04 \pm 0.02$, in good
agreement with the expected value $H = \theta/\mu \approx 0.01$ (55). A faster apparent migration is observed at large times
for this simulation, due to the transition from the uniform background distribution for large times preceding the
mainshock to the clustered seismicity prior to the mainshock.
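
A fit of the form $R \sim t^H$, as described in the caption of Figure 9, can be sketched as a straight-line fit in log-log coordinates. This is an illustration only; the caption does not state which fitting procedure was actually used, and the function name and the synthetic data below are assumptions.

```python
import numpy as np

def fit_power_law_exponent(t, R):
    """Least-squares fit of R = A * t**H in log-log space (illustrative sketch)."""
    t = np.asarray(t, dtype=float)
    R = np.asarray(R, dtype=float)
    # np.polyfit returns the coefficients from highest degree to lowest.
    H, log_A = np.polyfit(np.log10(t), np.log10(R), 1)
    return H, 10.0 ** log_A

# Synthetic check with a known exponent (hypothetical data, not from the paper).
t = np.logspace(-4, 4, 20)
H, A = fit_power_law_exponent(t, 3.0 * t ** 0.2)
print(H, A)
```

An unweighted log-log regression gives every point the same weight, so in practice the fitting range should exclude times where the background seismicity dominates.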
Figure 10. Results of prediction tests for synthetic catalogs generated with the parameters $a = 0.5\beta$, $n = 1$,
$\beta = 2/3$, $\theta = 0.2$, $c = 0.001$ day and a constant source $\mu = 0.001$ shocks per day. The minimum magnitude is
$m_0 = 3$ and the target events are $M > 6$ mainshocks. We have generated 500 synthetic catalogs of 10000 events
each, leading to a total of 4735 $M > 6$ mainshocks. We use three functions measured in a sliding window of
100 events: (i) the maximum magnitude $M_{max}$ of the 100 events in that window, (ii) the apparent
Gutenberg-Richter exponent $\beta$ measured on these 100 events by the standard Hill maximum likelihood estimator and (iii)
the seismicity rate $r$ defined as the inverse of the duration of the window. For each function, we declare an alarm
when the function is either larger (for $M_{max}$ and $r$) or smaller (for $\beta$) than a threshold. Once triggered, each
alarm remains active as long as the function remains larger (for $M_{max}$ and $r$) or smaller (for $\beta$) than the threshold.
Scanning all possible thresholds constructs the continuous curves shown in the error diagram. The quality of the
predictions is measured by plotting the ratio of failures to predict as a function of the total duration of the alarms
normalized by the duration of the catalog. The results for these three functions are considerably better than those
obtained for a random prediction, shown as a dashed line for reference. The best results are obtained using the
seismicity rate. Predictions based on the Gutenberg-Richter $\beta$ and on the maximum magnitude observed within
the running window provide similar results.
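
The error diagram described in the caption of Figure 10 can be sketched for the seismicity-rate function $r$ as follows. This is a minimal illustration under stated assumptions, not the code used for the figure: the rate computed from the window ending at event $i$ is taken to govern the interval up to event $i+1$, event times are assumed strictly increasing, and at least one target is assumed to fall after the first full window. The function name `error_diagram` and the synthetic catalog are hypothetical.

```python
import numpy as np

def error_diagram(times, is_target, window=100):
    """Error diagram for an alarm based on the seismicity rate (sketch).

    Returns (alarm_fraction, miss_fraction), one point per threshold,
    from the strictest threshold (no alarm) to the loosest (always on).
    """
    times = np.asarray(times, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    # Index i of the last event in each window; event i + 1 must exist.
    idx = np.arange(window - 1, len(times) - 1)
    # Rate r = inverse duration of the window of `window` events ending at i.
    rate = 1.0 / (times[idx] - times[idx - window + 1])
    gaps = times[idx + 1] - times[idx]  # interval governed by rate[i]
    next_is_target = is_target[idx + 1]  # does a target close this interval?
    # Lowering the threshold switches intervals on in order of decreasing rate.
    order = np.argsort(-rate)
    alarm = np.concatenate([[0.0], np.cumsum(gaps[order]) / gaps.sum()])
    hits = np.cumsum(next_is_target[order]) / next_is_target.sum()
    miss = np.concatenate([[1.0], 1.0 - hits])
    return alarm, miss

# Hypothetical catalog: deterministic, strictly increasing event times,
# with every 50th event flagged as a target.
t = np.cumsum(1.0 + np.sin(np.arange(1000)) ** 2)
targets = np.zeros(1000, dtype=bool)
targets[::50] = True
alarm, miss = error_diagram(t, targets)
```

A random prediction corresponds to the diagonal `miss = 1 - alarm`; a useful alarm function gives a curve below that diagonal. The same construction applies to $M_{max}$, and to $\beta$ with the ordering reversed.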



